DEFENDIS: Decentralized Federated Learning for IoT Device Identification and Security
This project implements a framework that uniquely identifies each device deployed in an IoT platform in a distributed and robust manner, mitigating security threats based on device impersonation or malicious deployment.
Project Overview
The DEFENDIS (DEcentralized FEderated learNing for IoT Device Identification and Security) project was developed in collaboration with the Federal Office for Defence Procurement armasuisse from April 2023 to November 2023. The framework addresses critical security challenges in IoT environments by combining hardware device fingerprinting with fully distributed machine learning model generation.
The framework relies on hardware device fingerprinting and on the fully distributed generation of ML/DL models. These models identify the devices and detect malicious elements that could undermine the robustness of the identification process. Monitoring of the processes running on each device serves as an additional contextual data source for securing the environment.
Main Objectives
1. Unique Device Identification
To provide a solution that reliably and uniquely identifies each sensor of a crowdsensing or Industrial IoT (IIoT) platform, countering sensor impersonation threats. To this end, the solution also monitors contextual information such as running processes, temperature, and CPU load, and adjusts the parameters of the generated fingerprint to that context. The privacy-preserving management and exchange of device fingerprints and models is based on Federated Learning (FL).
2. Adversarial Attack Resilience
To apply adversarial attacks against the solution and identify appropriate countermeasures, improving its resilience against malicious actors taking part in the federation. These attacks target both fingerprint generation and the FL model training and deployment process, so that the complete solution lifecycle is secured.
3. Decentralized Federated Learning Framework
To develop a fully Decentralized FL (DFL) framework for ML/DL model generation. The framework enables model training and fingerprint distribution across different stakeholders without sharing sensitive information and without a central entity managing model aggregation, which removes the bottleneck and reduces the attack surface of a centralized server.
4. Trust and Robustness Metrics
To analyze the main trust and robustness metrics related to the FL model generation process and integrate them into the framework developed under the previous objective. The metrics considered include robustness, privacy, fairness, accountability, and explainability.
Technical Architecture
Decentralized Federated Learning
The core of our system is a fully decentralized federated learning architecture that eliminates the need for a central server:
class DecentralizedFL:
    def __init__(self, network_topology):
        self.nodes = network_topology
        self.local_models = {}
        self.consensus_weights = self.initialize_weights()

    def train_round(self, node_id, local_data):
        # Local training
        local_model = self.train_local_model(node_id, local_data)
        # Consensus update with neighbors
        neighbor_models = self.get_neighbor_models(node_id)
        updated_model = self.consensus_update(local_model, neighbor_models)
        return updated_model
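The consensus step above is not spelled out in this write-up. One common way to realize it is plain neighbor averaging, sketched below under stated assumptions: the ring topology, the helper names `ring_topology` and `consensus_round`, and the use of plain lists as stand-ins for model parameters are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Model = List[float]  # stand-in for a flattened parameter vector


def ring_topology(num_nodes: int) -> Dict[int, List[int]]:
    """Hypothetical topology: each node is linked to its two ring neighbors."""
    return {
        i: [(i - 1) % num_nodes, (i + 1) % num_nodes]
        for i in range(num_nodes)
    }


def consensus_round(models: Dict[int, Model],
                    topology: Dict[int, List[int]]) -> Dict[int, Model]:
    """Each node replaces its model with the mean of itself and its neighbors."""
    updated = {}
    for node, neighbors in topology.items():
        group = [models[node]] + [models[n] for n in neighbors]
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


# Four nodes start from different one-parameter "models"
topology = ring_topology(4)
models = {0: [0.0], 1: [4.0], 2: [8.0], 3: [12.0]}
for _ in range(50):
    models = consensus_round(models, topology)
# Every node now holds a value very close to the global mean, 6.0
```

Because every node applies the same symmetric weights, repeated rounds drive all nodes toward the global average without any node ever seeing more than its direct neighbors.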
Hardware Device Fingerprinting
Our system creates unique digital signatures for IoT devices based on multiple hardware characteristics:
- Hardware Features: CPU usage patterns, memory utilization, temperature profiles
- Network Behavior: Packet timing, connection patterns, protocol distributions
- Application Behavior: Process execution patterns, API call sequences
- Contextual Information: Running processes, temperature, CPU load
The fingerprinting algorithm uses a combination of statistical analysis and machine learning:
$$F_i = \{f_1, f_2, \dots, f_n\}$$
where $F_i$ represents the fingerprint of device $i$ and each $f_j$ is an individual feature.
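To make the notation concrete, the sketch below builds a small feature vector $F_i$ from raw readings and compares fingerprints by distance. It is an illustration only: the choice of mean and standard deviation as features, the Euclidean distance, the function names, and the sample readings are assumptions, and a simple distance stands in for the ML/DL models used by the framework.

```python
import math
import statistics
from typing import Dict, List, Sequence


def build_fingerprint(samples: Dict[str, Sequence[float]]) -> List[float]:
    """Summarize each monitored signal by its mean and standard deviation."""
    fingerprint = []
    for name in sorted(samples):  # fixed order so vectors are comparable
        values = samples[name]
        fingerprint.append(statistics.fmean(values))
        fingerprint.append(statistics.pstdev(values))
    return fingerprint


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up readings, for illustration only
enrolled = build_fingerprint(
    {"cpu_load": [0.40, 0.42, 0.41], "temp_c": [51.0, 52.0, 51.5]})
same_dev = build_fingerprint(
    {"cpu_load": [0.41, 0.43, 0.40], "temp_c": [51.2, 51.8, 51.6]})
other_dev = build_fingerprint(
    {"cpu_load": [0.70, 0.75, 0.72], "temp_c": [60.0, 61.0, 60.5]})

# A re-measurement of the same device lies closer to the enrolled
# fingerprint than a measurement from a different device.
is_closer = distance(enrolled, same_dev) < distance(enrolled, other_dev)
```

In practice the features would need scaling before comparison, since signals such as temperature and CPU load live on very different ranges.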
Privacy-Preserving Techniques
We implement several privacy-preserving mechanisms:
- Differential Privacy: Adding calibrated noise to model updates
- Secure Aggregation: Cryptographic protocols for secure model combination
- Local Processing: All sensitive data remains on local devices
- Decentralized Architecture: No central server required for model aggregation
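As an illustration of the differential privacy item, the sketch below shows the usual clip-then-add-noise pattern (the Gaussian mechanism) applied to a model update. The function names, the clipping norm, and the noise multiplier are assumptions chosen for the example, not the parameters used in DEFENDIS. Translating a noise multiplier into a formal privacy budget requires a privacy accountant, which is not shown.

```python
import math
import random
from typing import List


def clip_update(update: List[float], clip_norm: float) -> List[float]:
    """Scale the update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def privatize_update(update: List[float], clip_norm: float,
                     noise_multiplier: float,
                     rng: random.Random) -> List[float]:
    """Clip, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]


rng = random.Random(0)       # seeded only to make the example repeatable
raw_update = [3.0, 4.0]      # L2 norm is 5.0
clipped = clip_update(raw_update, clip_norm=1.0)  # rescaled to norm 1.0
noisy = privatize_update(raw_update, clip_norm=1.0,
                         noise_multiplier=0.5, rng=rng)
```

Clipping bounds how much any single update can influence the shared model, which is what allows the added noise to be calibrated to a fixed scale.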
Implementation Details
Core Components
- Decentralized FL Engine: Manages the fully distributed training process
- Hardware Fingerprinting Module: Extracts and analyzes device characteristics
- Contextual Monitoring System: Tracks running processes and environmental data
- Adversarial Attack Detection: Identifies and mitigates malicious actors
- Trust Metrics Framework: Implements robustness, privacy, fairness, accountability, and explainability metrics
Technology Stack
- Backend: Python 3.9+, PyTorch, NumPy, SciPy
- Federated Learning: Custom decentralized implementation with PyTorch
- Cryptography: PyCryptodome for secure aggregation
- API: FastAPI for RESTful endpoints
- Database: PostgreSQL for metadata storage
- Deployment: Docker containers with Kubernetes orchestration
Real-World Applications
The framework has been deployed in several environments:
- Smart City Infrastructure: Securing traffic sensors and environmental monitors
- Industrial IoT: Protecting manufacturing equipment and control systems
- Healthcare IoT: Securing medical devices and patient monitoring systems
- Home Automation: Protecting smart home devices and networks
Challenges and Solutions
Technical Challenges
- Communication Overhead: Managing peer-to-peer communication in large networks
  - Solution: Implemented efficient gossip protocols and selective communication
- Model Convergence: Ensuring consistent model convergence across heterogeneous devices
  - Solution: Developed adaptive learning rates and robust aggregation algorithms
- Privacy Attacks: Protecting against model inversion and membership inference attacks
  - Solution: Implemented differential privacy with carefully tuned noise parameters
- Adversarial Attacks: Defending against malicious actors in the federation
  - Solution: Developed robust aggregation algorithms and anomaly detection systems
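The robust aggregation algorithms mentioned above are not detailed here. A widely used example of the idea is the coordinate-wise median, sketched below next to a plain mean for comparison. The function names and the toy updates are assumptions for illustration, and the median is offered as one representative technique rather than the specific algorithm used in the project.

```python
import statistics
from typing import List


def mean_aggregate(updates: List[List[float]]) -> List[float]:
    """Plain averaging: every participant has equal influence."""
    return [statistics.fmean(col) for col in zip(*updates)]


def median_aggregate(updates: List[List[float]]) -> List[float]:
    """Coordinate-wise median: extreme values in a minority are ignored."""
    return [statistics.median(col) for col in zip(*updates)]


# Four honest updates plus one poisoned update (made-up numbers)
honest = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
poisoned = honest + [[100.0, -100.0]]

mean_result = mean_aggregate(poisoned)      # pulled far from the honest values
median_result = median_aggregate(poisoned)  # stays at [1.0, 2.0]
```

A single malicious participant can shift the mean arbitrarily, while the median stays within the range of the honest updates as long as honest participants form a majority.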
Deployment Challenges
- Heterogeneous Environments: Supporting diverse IoT devices and networks
  - Solution: Created modular architecture with device-specific adapters
- Resource Constraints: Working with limited computational resources
  - Solution: Optimized model architectures and implemented efficient inference
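A modular architecture with device-specific adapters could look roughly like the sketch below. Everything in it is hypothetical: the `DeviceAdapter` interface, the registry, the `example_sensor` device type, and the placeholder readings are invented to show the pattern and do not describe the project's actual module layout.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class DeviceAdapter(ABC):
    """Hypothetical interface: every device type exposes metrics uniformly."""

    @abstractmethod
    def read_metrics(self) -> Dict[str, float]:
        ...


# Registry mapping a device type name to its adapter class
ADAPTERS: Dict[str, Type[DeviceAdapter]] = {}


def register(device_type: str) -> Callable[[Type[DeviceAdapter]], Type[DeviceAdapter]]:
    """Class decorator that adds an adapter to the registry."""
    def decorator(cls: Type[DeviceAdapter]) -> Type[DeviceAdapter]:
        ADAPTERS[device_type] = cls
        return cls
    return decorator


@register("example_sensor")
class ExampleSensorAdapter(DeviceAdapter):
    def read_metrics(self) -> Dict[str, float]:
        # Placeholder values; a real adapter would query the device
        return {"cpu_load": 0.25, "temp_c": 48.0}


def collect(device_type: str) -> Dict[str, float]:
    """Look up the adapter for a device type and read its metrics."""
    return ADAPTERS[device_type]().read_metrics()


metrics = collect("example_sensor")
```

With this shape, supporting a new device type means adding one adapter class, while the fingerprinting and training code keeps calling `collect` unchanged.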
Future Directions
Planned Enhancements
- Blockchain Integration: Immutable audit trails for model updates
- Edge Computing: Local processing capabilities for real-time response
- Cross-Silo Federated Learning: Multi-organization collaboration
- Quantum-Resistant Cryptography: Future-proof security measures
Research Opportunities
- Federated Learning for Large Language Models: Scaling to transformer architectures
- Federated Learning with Foundation Models: Personalized AI assistants
- Federated Learning at the Edge: Mobile and IoT device optimization
Conclusion
The DEFENDIS framework represents a significant advancement in IoT security, enabling collaborative threat detection while preserving data privacy through fully decentralized federated learning. The hardware fingerprinting approach, combined with robust privacy-preserving techniques and adversarial attack resilience, provides a comprehensive solution for securing IoT deployments across various domains.
The project demonstrates the potential of decentralized federated learning to address real-world security challenges while respecting privacy and regulatory requirements. As IoT continues to grow, such privacy-preserving approaches will become increasingly important for maintaining security without compromising user privacy.
This project was developed in collaboration with the Federal Office for Defence Procurement armasuisse from April 2023 to November 2023. For more information about the implementation or potential collaborations, please contact me at [email protected].