
Overview: Designed and implemented a hierarchical feature selection approach to mitigate the "curse of dimensionality" in high-dimensional oncology genomic data. By hybridizing Ant Colony Optimization (ACO) with the Firefly algorithm, we built a pipeline that isolates a small set of critical breast cancer biomarkers from high-dimensional genomic datasets.
Architecture & Execution:
Pipeline Design: Engineered a two-stage evaluation system: a statistical pre-filtering layer utilizing Mutual Information, followed by a robust Wrapper evaluation layer.
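The pre-filtering stage above can be sketched in a few lines. This is a minimal illustration, assuming expression values have already been discretized into bins; the function names `mutual_information` and `prefilter` and the toy data are hypothetical, and the estimator used in the actual project may differ.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi

def prefilter(columns, labels, k):
    """Return the indices of the k features with the highest MI against labels."""
    scores = [mutual_information(col, labels) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]

# Toy data (assumption): three discretized features, four samples.
labels = [0, 0, 1, 1]
columns = [
    [0, 0, 1, 1],  # tracks the label exactly
    [0, 1, 0, 1],  # independent of the label
    [0, 0, 0, 1],  # partially informative
]
top = prefilter(columns, labels, k=2)
```

Only the features that survive this ranking would be handed to the more expensive wrapper stage.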
Swarm Orchestration: Used ACO for global exploration to identify promising feature subspaces, then applied the Firefly algorithm for local exploitation within those subspaces to prune redundant features.
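The two search operators described above can be illustrated as follows. This is a sketch under stated assumptions, not the project's actual update rules: `aco_sample` draws a feature subset with probability proportional to pheromone, and `firefly_move` uses one common binary discretization of the Firefly attraction step (bits are copied from a brighter solution with probability beta0 * exp(-gamma * r^2), where r is the Hamming distance). All names and parameter values are hypothetical.

```python
import math
import random

def aco_sample(pheromone, n_select, rng):
    """Sample n_select distinct feature indices, weighted by pheromone."""
    candidates = list(range(len(pheromone)))
    weights = list(pheromone)
    chosen = []
    for _ in range(n_select):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return sorted(chosen)

def firefly_move(current, brighter, rng, beta0=1.0, gamma=0.1):
    """Move a binary feature mask toward a brighter (fitter) mask."""
    r = sum(a != b for a, b in zip(current, brighter))  # Hamming distance
    beta = beta0 * math.exp(-gamma * r * r)             # attractiveness
    # Each differing bit is copied from the brighter mask with probability beta.
    return [b if (a != b and rng.random() < beta) else a
            for a, b in zip(current, brighter)]

rng = random.Random(0)
# Exploration: only features 2 and 3 carry pheromone, so they are selected.
subset = aco_sample([0.0, 0.0, 5.0, 5.0, 0.0], n_select=2, rng=rng)
# Exploitation: with gamma=0 and beta0=1 the move copies the brighter mask.
moved = firefly_move([0, 0, 1, 1], [1, 0, 0, 1], rng, beta0=1.0, gamma=0.0)
```

In a hierarchical arrangement, subsets sampled by the first operator would seed the population that the second operator refines.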
Mathematical Regularization: Mitigated overfitting of the search to the training data, a common issue in microarray analysis, by constraining the search space. This was achieved through strict cardinality penalties on subset size and a heavily regularized LinearSVC as the wrapper evaluator.
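One way to express a cardinality penalty is to subtract a term proportional to subset size from the wrapper score. The sketch below assumes that form; the function name `penalized_fitness`, the weight `lam`, the optional hard cap `max_features`, and the toy evaluator are all hypothetical stand-ins. In the pipeline described here the evaluator would be the cross-validated score of a LinearSVC (in scikit-learn, a smaller `C` means stronger regularization).

```python
def penalized_fitness(mask, evaluate, lam=0.1, max_features=None):
    """Wrapper score minus a penalty that grows with the number of features."""
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot be evaluated
    if max_features is not None and len(selected) > max_features:
        return 0.0  # hard cap: oversized subsets are rejected outright
    score = evaluate(selected)
    return score - lam * len(selected) / len(mask)

def toy_evaluate(selected):
    """Stand-in for a classifier score: features 0 and 2 are the useful ones."""
    return 0.9 if {0, 2} <= set(selected) else 0.5

compact = penalized_fitness([1, 0, 1, 0, 0, 0, 0, 0, 0, 0], toy_evaluate)
bloated = penalized_fitness([1] * 10, toy_evaluate)
capped = penalized_fitness([1] * 10, toy_evaluate, max_features=5)
```

Both masks reach the same raw score here, so the penalty alone makes the two-feature subset the fitter candidate.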
Validation: Quantified the swarm's convergence (stigmergic emergence) by tracking the decay of Shannon entropy over the course of the search.
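The entropy-decay idea can be shown on a toy pheromone vector. This sketch assumes the entropy is computed over the normalized pheromone distribution and uses a simplified evaporate-and-deposit update; the function names, `rho`, `deposit`, and the reinforced features are hypothetical.

```python
import math

def shannon_entropy(pheromone):
    """Shannon entropy (bits) of the normalized pheromone distribution."""
    total = sum(pheromone)
    probs = [t / total for t in pheromone if t > 0]
    return -sum(p * math.log2(p) for p in probs)

def evaporate_and_deposit(pheromone, chosen, rho=0.5, deposit=1.0):
    """Evaporate every trail by rho, then reinforce the chosen features."""
    return [(1 - rho) * t + (deposit if j in chosen else 0.0)
            for j, t in enumerate(pheromone)]

pheromone = [1.0] * 8                 # uniform start: maximum entropy
history = [shannon_entropy(pheromone)]
for _ in range(5):
    # Assumption: the colony keeps reinforcing features 0 and 3.
    pheromone = evaporate_and_deposit(pheromone, chosen={0, 3})
    history.append(shannon_entropy(pheromone))
```

A steadily falling `history` indicates that the colony is concentrating pheromone on a few features rather than wandering at random.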
Impact & Results:
Achieved a 93.5% reduction in dimensionality, narrowing the pre-filtered set of 200 features to a subset of just 13 critical biomarkers.
Delivered high predictive accuracy on an isolated test set (F1-Macro = 0.9440) for breast cancer subtype classification.
Showed that the hierarchical, hybridized architecture outperforms isolated metaheuristics, prioritizing clinical viability through the Principle of Parsimony.
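For readers unfamiliar with the F1-Macro metric reported above, it is the unweighted mean of the per-class F1 scores, so small classes count as much as large ones. A minimal sketch follows; the class labels and predictions are illustrative placeholders, not project data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

# Placeholder labels (assumption): three generic classes, six samples.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "C"]
score = macro_f1(y_true, y_pred)
```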
I'm open to new opportunities. Let's discuss how I can bring this level of engineering to your team.