Projects

1. LLM-GMP: Zero-Shot Graph Learning

Published 2025 | Co-authored with Md Athikul Islam, Edoardo Serra

The Challenge: Traditional GNNs rely heavily on labeled datasets, which are scarce in many real-world domains. The Solution: I developed Large Language Model-based Graph Message Passing (LLM-GMP), a framework that leverages the semantic reasoning of LLMs to propagate information across graph nodes without task-specific fine-tuning.

🛠 Engineering Implementation

  • Inference Optimization: Deployed local quantization of large-scale models (Llama-3-70B/Mistral) to reduce VRAM requirements by 40% while maintaining reasoning accuracy; see the quantization sketch after this list.
  • Infrastructure: Orchestrated a custom Linux KVM environment with GPU passthrough to isolate inference tasks from standard compute workloads.
  • Tech Stack: PyTorch, HuggingFace Transformers, Docker, Unraid (Hypervisor).
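
One way to apply local quantization like this is the HuggingFace Transformers bitsandbytes integration. A minimal sketch is shown below; the model ID, 4-bit NF4 settings, and device map are illustrative assumptions, not the exact configuration used in the project.

Python

# Sketch: load a large causal LM in 4-bit NF4 to cut VRAM usage (illustrative settings)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumption: any local causal LM works here

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights via bitsandbytes
    bnb_4bit_quant_type="nf4",              # NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # keep activations in bf16 for accuracy
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",  # spread layers across available GPUs
)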

Download Paper (PDF) | View Source Code (GitHub)

Python

# Concept: Semantic Message Passing Wrapper (DGL-style graph, conceptual LLM hook)
import torch.nn as nn
import dgl.function as fn  # DGL message-passing primitives

class LLMMessagePassing(nn.Module):
    def __init__(self, llm_backbone, hidden_dim):
        super().__init__()
        self.llm = llm_backbone
        # Projector to map LLM embeddings into the graph representation space
        self.projector = nn.Linear(llm_backbone.hidden_size, hidden_dim)

    def forward(self, graph, node_feats):
        # 1. Generate semantic reasoning features per node (conceptual LLM call)
        reasoning = self.llm.generate_reasoning(node_feats)

        # 2. Propagate reasoning across the graph structure (sum over neighbors)
        graph.ndata['h'] = reasoning
        graph.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h'))

        # 3. Project the aggregated messages back to the task space
        return self.projector(graph.ndata['h'])

2. 2FWL-SIRGN: Scalable Graph Partitioning

Published 2024 | Co-authored with Edoardo Serra

The Challenge: The Folklore Weisfeiler-Lehman (FWL) test is a powerful isomorphism test, but its O(n^k) cost for the k-dimensional variant makes it unusable for large-scale graphs. The Solution: I implemented a structural 2-dimensional FWL approach paired with a novel graph partitioning algorithm.
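
For context, the 2-FWL test refines colors over node pairs: each pair's color is updated from the multiset of color pairs it forms with every intermediate node. Below is a minimal, dense sketch of one refinement round, illustrative only; the published method's contribution is the partitioning that avoids paying this cost on the full graph.

Python

# Sketch: one round of 2-FWL (folklore) color refinement over node pairs.
# Dense version that touches every (pair, intermediate node) combination.
import numpy as np

def initial_colors(adj: np.ndarray) -> dict:
    """Initial pair colors from the identity/adjacency pattern (atomic types)."""
    n = adj.shape[0]
    return {(u, v): (int(u == v), int(adj[u, v])) for u in range(n) for v in range(n)}

def two_fwl_refine(colors: dict, n: int) -> dict:
    """colors maps a pair (u, v) to its current color label."""
    new_colors = {}
    for u in range(n):
        for v in range(n):
            # Multiset of color pairs contributed by every intermediate node w
            neighborhood = sorted((colors[(u, w)], colors[(w, v)]) for w in range(n))
            new_colors[(u, v)] = hash((colors[(u, v)], tuple(neighborhood)))
    return new_colors

Iterating this update until the color histogram stops changing yields the stable 2-FWL coloring.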

🛠 Engineering Implementation

  • Distributed Processing: Designed the partitioning algorithm to be parallelizable, allowing sub-graphs to be processed across multiple Docker containers; see the sketch after this list.
  • Performance: Optimized the structural iterative representation learning (SIR-GN) core to handle graphs 10x larger than previous state-of-the-art implementations.
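
The parallel design can be viewed as embarrassingly parallel work over partitions: each sub-graph is handed to an independent worker and the per-node results are merged afterwards. A minimal sketch using Python's multiprocessing as a stand-in for the Docker-container workers; process_partition is a hypothetical placeholder for the per-partition 2FWL-SIRGN computation.

Python

# Sketch: fan partitions out to parallel workers; containers play this role in deployment.
from multiprocessing import Pool

def process_partition(partition):
    """Hypothetical placeholder for running 2FWL-SIRGN on one sub-graph."""
    nodes, edges = partition
    return {node: len([e for e in edges if node in e]) for node in nodes}  # toy feature

def run_parallel(partitions, workers=4):
    with Pool(processes=workers) as pool:
        results = pool.map(process_partition, partitions)
    merged = {}
    for result in results:
        merged.update(result)  # node IDs are unique across partitions by construction
    return merged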

Download Paper (PDF) | View Source Code (GitHub)


3. Temporal SIR-GN: Dynamic Network Analysis

Published 2023 | Co-authored with Janet Layne, Edoardo Serra, Francesco Gullo

The Challenge: Standard GNNs struggle to capture structural patterns that evolve over time (temporal dynamics). The Solution: Temporal SIR-GN extends structural node representation learning into the temporal dimension, tracking how a node's structural role changes across graph snapshots.
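
Conceptually, the temporal extension maintains a structural state per node and updates it snapshot by snapshot, so the final vector reflects how the node's neighborhood structure changed over time. The following is a toy sketch of that idea with a hypothetical update rule; it mirrors the intuition, not the published algorithm.

Python

# Sketch: fold a sequence of graph snapshots into per-node structural state vectors.
import numpy as np

def temporal_structural_embedding(snapshots, dim=8):
    """snapshots: time-ordered list of adjacency dicts {node: set(neighbors)}."""
    nodes = set()
    for snap in snapshots:
        nodes.update(snap)
        for neighbors in snap.values():
            nodes.update(neighbors)
    state = {node: np.zeros(dim) for node in nodes}
    for snap in snapshots:
        previous = {node: vec.copy() for node, vec in state.items()}
        for node, neighbors in snap.items():
            # Hypothetical update: blend the node's history with its neighbors'
            # previous structural state plus a degree signal for this snapshot.
            neigh_mean = (np.mean([previous[n] for n in neighbors], axis=0)
                          if neighbors else np.zeros(dim))
            degree_signal = np.full(dim, float(len(neighbors)))
            state[node] = 0.8 * previous[node] + 0.2 * (neigh_mean + degree_signal)
    return state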

🛠 Engineering Implementation

  • Efficiency: Engineered the algorithm to match the predictive performance of state-of-the-art temporal GNNs while consuming significantly less computational power.
  • Data Pipeline: Built automated data ingestion pipelines to process temporal snapshots of complex network datasets.
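
The ingestion side reduces to reading timestamped edge records, bucketing them into fixed-width time windows, and emitting one snapshot per window. A minimal sketch assuming a CSV with src, dst, and timestamp columns; the column names and window size are illustrative.

Python

# Sketch: bucket timestamped edges into snapshot adjacency dicts (illustrative column names).
import csv
from collections import defaultdict

def load_snapshots(path, window=3600):
    """Group (src, dst, timestamp) rows into one adjacency dict per time window."""
    snapshots = defaultdict(lambda: defaultdict(set))
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            bucket = int(float(row["timestamp"])) // window
            snapshots[bucket][row["src"]].add(row["dst"])
    # Return snapshots ordered by time, as plain dicts
    return [dict(snapshots[b]) for b in sorted(snapshots)]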

Download Paper (PDF) | View Source Code (GitHub)


4. Botnet Node Detection

Published 2021 | Co-authored with Janet Layne, Edoardo Serra, Alfredo Cuzzocrea

The Challenge: Detecting malicious botnet nodes within massive streams of internet traffic without overfitting to specific attack patterns. The Solution: Applied Structural Node Representation Learning to identify malicious nodes from their connection topology rather than from packet contents alone.
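
Because the signal comes from connection topology alone, features can be computed directly from flow records without inspecting payloads. A minimal sketch that derives simple degree-based structural statistics per host; the actual model relies on learned structural node representations rather than these hand-picked features.

Python

# Sketch: structural (topology-only) node features from flow records, no packet contents.
import networkx as nx

def structural_features(flows):
    """flows: iterable of (src_ip, dst_ip) pairs observed in traffic."""
    graph = nx.DiGraph()
    graph.add_edges_from(flows)
    features = {}
    for node in graph.nodes:
        # Fan-out of a node's successors hints at command-and-control style patterns
        succ_out = [graph.out_degree(s) for s in graph.successors(node)]
        features[node] = {
            "in_degree": graph.in_degree(node),
            "out_degree": graph.out_degree(node),
            "mean_successor_out_degree": sum(succ_out) / len(succ_out) if succ_out else 0.0,
        }
    return features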

🛠 Engineering Implementation

  • Real-Time Analysis: Designed the model to operate at less than half the computational cost of existing intrusion detection systems, making it viable for real-time traffic monitoring.
  • Robustness: Achieved high detection rates while explicitly preventing overfitting, a common failure point in security ML models.

Download Paper (PDF) | View Source Code (GitHub)