Applied Data Mining & Computational Vision
The Problem
This portfolio brings together four assignments from the Data Mining & Computational Vision module. Rather than treating them as disconnected tasks, I approached each one as a study in a distinct corner of applied AI: graph-structured data, unstructured text, image quality degradation, and spatial recognition in augmented reality.
The four problems were: mining network structure from a movie dataset, classifying agricultural advertisements using NLP, enhancing dark images for downstream vision tasks, and building a landmark recognition system that works in an AR context. Together they cover the breadth of what "data mining and computer vision" actually means in practice.
My Approach
Each project required a different framing. Network analysis is about relationships between entities, not attributes of individual ones. Text classification requires representing language numerically without losing semantic meaning. Image enhancement is a signal processing problem with a perceptual quality target. Landmark recognition under AR constraints requires robustness to viewing angle, lighting, and occlusion.
I worked through each problem independently before looking for connections — and found them in the shared need for a principled evaluation strategy, and in the consistent requirement to justify design choices rather than just report numbers.
What I Built
Part 1 — Network Movies Analysis
The movie dataset was modelled as a graph: nodes represent films or actors (depending on the analysis), and edges represent connections (shared actors, sequels, genre overlap). I used NetworkX to compute graph-theoretic properties: degree distribution, clustering coefficients, shortest paths, and centrality measures.
The analysis revealed which films act as bridging hubs in the network (high betweenness centrality, meaning many shortest paths pass through them), which clusters of movies are tightly connected within genres, and how far the network deviates from a random graph. That last property is relevant to understanding how recommendations and word-of-mouth propagate through film audiences.
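The measures named above can be made concrete with a small dependency-free sketch. The toy edge list and node names are hypothetical, not the assignment's data, and the helpers are written out by hand only to show what is being computed; NetworkX provides the same quantities through `nx.clustering`, `nx.shortest_path_length`, and `nx.betweenness_centrality`.

```python
from collections import deque

# Hypothetical toy graph: films as nodes, an edge when two films share an actor.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),   # tightly connected cluster 1
    ("D", "E"), ("D", "F"), ("E", "F"),   # tightly connected cluster 2
    ("C", "D"),                           # single bridge between the clusters
]

def build_adjacency(edge_list):
    """Undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def shortest_path_length(adj, source, target):
    """Breadth-first search distance in an unweighted graph (None if unreachable)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

adj = build_adjacency(edges)
degree = {node: len(nbrs) for node, nbrs in adj.items()}

print(degree["C"])                          # C has the bridge edge plus two cluster edges
print(local_clustering(adj, "A"))           # A's two neighbours are linked to each other
print(shortest_path_length(adj, "A", "F"))  # path must cross the C-D bridge
```

In this toy graph, every shortest path between the two clusters passes through C and D, which is the situation betweenness centrality is designed to flag.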
Assignment 3 — Farm Advertisement Text Mining
The task was to classify farm-related advertisements by category using natural language processing. Text preprocessing included lowercasing, punctuation removal, stop-word filtering, and stemming. Features were extracted using TF-IDF vectorisation, which weights each term's frequency within a document by how rare that term is across the whole corpus, so words that appear in almost every advertisement contribute little to the representation.
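As a rough illustration of that weighting, here is a minimal hand-rolled TF-IDF. The three token lists are hypothetical stand-ins for preprocessed advertisements, and the formula is the plain textbook variant; library implementations such as scikit-learn's `TfidfVectorizer` apply a smoothed IDF and L2 normalisation by default, so their numbers will differ.

```python
import math
from collections import Counter

# Hypothetical, already-preprocessed adverts (lowercased, stop words removed).
docs = [
    ["tractor", "sale", "diesel", "tractor"],
    ["cattle", "feed", "sale"],
    ["seed", "sale", "organic", "seed"],
]

def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / df), where df is the number of documents containing the term
    """
    n_docs = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

weights = tf_idf(docs)
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:8s} {weight:.3f}")
```

Because "sale" occurs in every document, its IDF is log(1) = 0 and it drops out entirely, while "tractor" is both frequent in the first advert and absent elsewhere, so it receives the largest weight.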
Classification models were trained on the TF-IDF representation and evaluated by precision, recall, and F1 per category. The analysis also included exploratory text mining to surface the most discriminative terms for each advertisement class — giving a human-readable explanation of what the model learned to look for.
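The per-category evaluation can be sketched as follows. The category names and predictions are invented for illustration and are not results from the assignment; the function simply counts true positives, false positives, and false negatives per label.

```python
def per_class_scores(y_true, y_pred):
    """Precision, recall and F1 for each label, computed one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores

# Hypothetical ground truth and predictions for three advert categories.
y_true = ["livestock", "livestock", "machinery", "machinery", "seed", "seed"]
y_pred = ["livestock", "machinery", "machinery", "machinery", "seed", "livestock"]

scores = per_class_scores(y_true, y_pred)
for label, s in scores.items():
    print(label, {k: round(v, 3) for k, v in s.items()})
```

Reporting the three numbers per category, rather than a single accuracy figure, shows which classes are being confused with which: here "machinery" has perfect recall but absorbs a misclassified "livestock" advert, which lowers its precision.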
Assignment 4 — Dark Image Enhancement
Low-light images suffer from noise, loss of detail in shadow regions, and colour distortion — all of which degrade the performance of downstream computer vision systems. The goal was to recover perceptually and computationally useful images from dark inputs.
I implemented and compared classical enhancement techniques (global histogram equalisation and CLAHE, contrast-limited adaptive histogram equalisation) against deep learning approaches. CLAHE applies equalisation locally in tiles rather than globally and clips each tile's histogram to limit noise amplification, which avoided the artefact-heavy outputs that global equalisation produces in high-contrast scenes. The deep learning pipeline was trained end-to-end to map dark inputs to clean references, with the model learning spatial detail recovery implicitly.
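The difference between plain and contrast-limited equalisation can be shown on a single histogram. This is a simplified sketch under stated assumptions: it works on a flat list of hypothetical intensities, it models only the clipping step, and it omits the tiling and inter-tile interpolation that full CLAHE performs. The `clip_limit` parameter here is an absolute bin count chosen for illustration, not a setting taken from the assignment.

```python
def equalize(pixels, levels=256, clip_limit=None):
    """Histogram-equalise a flat list of integer intensities in [0, levels).

    If clip_limit is given, histogram bins are capped at that count and the
    excess is spread evenly over all bins before the mapping is built. Real
    CLAHE also does this per tile and blends neighbouring tile mappings.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    if clip_limit is not None:
        excess = sum(max(0, h - clip_limit) for h in hist)
        hist = [min(h, clip_limit) + excess / levels for h in hist]

    # Cumulative distribution, rescaled to the output intensity range.
    total = sum(hist)
    mapping, running = [], 0.0
    for h in hist:
        running += h
        mapping.append(round((levels - 1) * running / total))
    return [mapping[p] for p in pixels]

# Hypothetical dark image: intensities crowded into the bottom of the range.
dark = [10] * 50 + [12] * 30 + [15] * 15 + [40] * 5

global_eq = equalize(dark)
limited_eq = equalize(dark, clip_limit=20)

print(sorted(set(dark)))
print(sorted(set(global_eq)))
print(sorted(set(limited_eq)))
```

Both variants stretch the crowded dark intensities over a wider range, but the clipped version separates neighbouring levels less aggressively, which is the mechanism that keeps noise in near-uniform shadow regions from being blown up.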
Output quality was assessed using PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index), which measure pixel-level accuracy and perceptual structure preservation respectively. A full 6.1 MB report documented the methodology and results with visual comparisons across all approaches.
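For reference, both metrics can be written in a few lines. This sketch assumes 8-bit greyscale images flattened to lists, and it computes SSIM once over the whole image; the usual formulation averages the same expression over small sliding windows, so values from a library implementation will not match exactly. The pixel values are made up for illustration.

```python
import math

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_value ** 2 / mse)

def global_ssim(x, y, max_value=255.0):
    """SSIM evaluated over the whole image as a single window (simplification)."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Standard stabilising constants with K1 = 0.01 and K2 = 0.03.
    c1, c2 = (0.01 * max_value) ** 2, (0.03 * max_value) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Hypothetical reference image and two degraded versions.
reference = [52, 60, 200, 180, 90, 30, 120, 240]
slightly_off = [p + 2 for p in reference]   # small uniform brightness error
flattened = [120] * len(reference)          # all structure destroyed

for name, img in [("slightly_off", slightly_off), ("flattened", flattened)]:
    print(name, round(psnr(reference, img), 2), round(global_ssim(reference, img), 3))
```

The uniform brightness shift barely affects SSIM because the structure is intact, whereas the flattened image scores close to zero on SSIM, which is why the two metrics are reported together.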
Assignment 5 — AR Landmark Recognition
The task was to build a system that recognises landmarks (buildings, sculptures, signage) from images captured in an augmented reality context — which means handling variable viewing angles, partial occlusion, reflections, and changing light conditions that would break a naive template-matching approach.
Feature extraction used local descriptors robust to affine transformations. A matching pipeline compared query image features to a reference database and returned the best candidate along with a confidence estimate. The AR overlay component drew recognition results directly onto the live frame.
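A matching pipeline of this shape might look like the sketch below. Everything specific in it is an assumption made for illustration: the three-dimensional descriptors and landmark names are invented, the nearest-neighbour ratio test is one common way to reject ambiguous matches rather than necessarily the strategy used in the assignment, and the confidence score is simply the fraction of query descriptors that matched the winning landmark.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def count_good_matches(query, reference, ratio=0.75):
    """Count query descriptors whose nearest reference descriptor is clearly
    closer than the second nearest (nearest-neighbour ratio test)."""
    good = 0
    for q in query:
        dists = sorted(euclidean(q, r) for r in reference)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            good += 1
    return good

def recognise(query, database, ratio=0.75):
    """Return (best_landmark, confidence) for a set of query descriptors."""
    scores = {name: count_good_matches(query, descs, ratio)
              for name, descs in database.items()}
    best = max(scores, key=scores.get)
    return best, scores[best] / len(query)

# Hypothetical reference database: landmark name -> list of descriptors.
database = {
    "clock_tower": [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    "fountain": [(5.0, 5.0, 5.0), (6.0, 5.0, 4.0), (4.0, 6.0, 5.0)],
}

# Hypothetical query frame: two descriptors near the clock tower, one stray.
query = [(0.1, 0.0, 0.9), (0.9, 0.1, 0.0), (5.0, 5.0, 5.1)]

landmark, confidence = recognise(query, database)
print(landmark, round(confidence, 2))
```

Returning a confidence alongside the label lets the overlay step suppress weak recognitions, for example when occlusion leaves only a handful of descriptors matched.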
The 3.1 MB notebook and 2.1 MB report document the full pipeline, including ablation experiments testing which descriptor and matching strategy performed best under different degradation conditions.
Notebooks: 4
Reports: 4 PDFs
Domains: Graph · NLP · CV · AR
Results
The network analysis surfaced clear hub structure in the movie graph: a small number of highly connected films acted as bridges between otherwise separate clusters, consistent with the scale-free properties often reported for real-world entertainment networks.
The text classifier achieved strong per-category F1 scores on the farm advertisement dataset, with the most discriminative features being domain-specific agricultural terms — validating that TF-IDF captured meaningful signal rather than noise.
CLAHE significantly outperformed global histogram equalisation on dark image enhancement, particularly in preserving local contrast in shadow regions without over-amplifying noise. The deep learning model improved further on CLAHE, recovering fine texture detail that classical methods lost.
The AR landmark system achieved reliable recognition across the test set under standard conditions, with performance degrading gracefully as occlusion and viewpoint variation increased — a more realistic result than the near-perfect numbers sometimes reported on clean benchmarks.
Note: full metric tables are in the individual assignment reports linked below.
What I Learned
Working across four different problem types in one submission forced me to confront how domain-specific "good evaluation" really is. PSNR and SSIM mean nothing for graph analysis; F1 per class means nothing for image quality. Choosing the right evaluation criterion is as important as choosing the right model, and it requires actually understanding the problem rather than applying a standard template.
The image enhancement project gave me a durable mental model for when classical methods can hold their own against deep learning: when the transformation is well-understood mathematically, has limited degrees of freedom, and the dataset is small. CLAHE has a clear prior, local contrast normalisation, that deep models have to discover from scratch. On small training sets, that prior is a real advantage, even though the deep model came out ahead here.
The AR project was where I most directly confronted the gap between academic benchmarks and real-world deployment. Recognition performance on clean, frontal, well-lit images is not a useful number. The useful number is how much performance degrades under the conditions users will actually create — and designing test sets that reflect that is harder than building the model.