Hybrid Brain Architecture (v3.0)
The core of Signal.Engine is a multi-layered reasoning system known as the Hybrid Brain. Moving beyond simple binary classifiers, this architecture treats market analysis as a sequence-modeling problem, combining the predictive power of Deep Learning with the disciplined decision-making of Reinforcement Learning.
The "Brain" functions as a three-stage pipeline: Perception (LSTM), Reasoning (SFT), and Optimization (PPO).
1. Temporal Perception: Sequence Modeling (LSTM)
The foundation of the brain is a Long Short-Term Memory (LSTM) network designed to process time-series data as a continuous narrative rather than isolated snapshots.
- Window Logic: The model ingests the last 50 candles (trading periods) to identify momentum, support/resistance levels, and volatility clusters.
- Feature Set: It processes a multi-dimensional vector including price action (Close, Log_Return) and technical oscillators (RSI, MACD).
- Internal Stability: Utilizes Batch Normalization and Dropout layers to maintain inference stability during high-volatility market regimes.
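The windowing described above can be sketched in a few lines. This is an illustrative example, not Signal.Engine's actual preprocessing code: the helper name `make_windows` and the toy feature rows are assumptions, while the window length (50) and feature names (Close, Log_Return, RSI, MACD) come from the text.

```python
# Illustrative sketch of assembling 50-candle windows for the LSTM.
# `make_windows` is a hypothetical helper, not Signal.Engine API.
WINDOW = 50

def make_windows(rows, window=WINDOW):
    """Slice a list of per-period feature vectors into overlapping
    fixed-length windows: len(rows) - window + 1 sequences in total."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]

# 60 periods, each with 4 features: [Close, Log_Return, RSI, MACD]
rows = [[100.0 + i, 0.001, 50.0, 0.2] for i in range(60)]
windows = make_windows(rows)
print(len(windows), len(windows[0]), len(windows[0][0]))  # 11 50 4
```

Each window is then fed to the LSTM as one sequence, so the model sees a continuous 50-period narrative rather than a single snapshot.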
2. The Teacher: Supervised Fine-Tuning (SFT)
Before the agent is allowed to trade, it undergoes Supervised Fine-Tuning. This phase builds "Market Common Sense."
- The Golden Dataset: The model is trained on hindsight-labeled data using a ZigZag Labeler. By looking at past charts, the teacher shows the model exactly where the "perfect" trades were.
- Pattern Recognition: This phase achieves ~77% directional accuracy, teaching the agent to recognize classic trend reversals and breakout patterns.
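A hindsight labeler of the kind described above can be sketched as follows. This is a simplified ZigZag-style example, not the engine's actual labeler: the function name `zigzag_labels` and the 5% reversal threshold are illustrative assumptions, while the 0/1/2 label scheme matches the action space defined later in this document.

```python
# Hedged sketch of hindsight (ZigZag-style) labeling; the real
# Signal.Engine labeler may differ in threshold and swing logic.
def zigzag_labels(closes, threshold=0.05):
    """Label each bar 2 (BUY) if price later rises by `threshold`
    before falling by it, 0 (SELL) in the opposite case, else 1 (HOLD)."""
    labels = []
    for i, price in enumerate(closes):
        label = 1  # HOLD by default (no decisive move in hindsight)
        for future in closes[i + 1:]:
            move = (future - price) / price
            if move >= threshold:
                label = 2  # BUY: hindsight shows an up-leg follows
                break
            if move <= -threshold:
                label = 0  # SELL: hindsight shows a down-leg follows
                break
        labels.append(label)
    return labels

print(zigzag_labels([100, 101, 107, 106, 99]))  # [2, 2, 0, 0, 1]
```

Because the labels peek at the future, they are usable only for supervised training on historical data, never at inference time.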
3. The Strategist: Reinforcement Learning (PPO)
While the LSTM learns direction, the Proximal Policy Optimization (PPO) layer learns strategy. This is where the agent develops its "trading personality."
- Risk-Adjusted Returns: The PPO agent is trained in a vectorized GPU environment where it is rewarded not just for profit, but for maintaining a high Sharpe Ratio and low Maximum Drawdown.
- Action Space: The agent chooses between three discrete actions:
  - 0: SELL / SHORT (Bearish conviction)
  - 1: HOLD / NEUTRAL (Preserving capital during noise)
  - 2: BUY / LONG (Bullish conviction)
- Reward Shaping: The agent is penalized for over-trading (transaction costs) and rewarded for holding winning positions through trend extensions.
4. Heuristic & Risk Overlay
The final decision is filtered through a Heuristic Expert layer. This layer produces the "Rational" entries shown in the dashboard, ensuring the AI's "black box" decisions align with quantitative risk parameters.
- Regime Detection: Monitors market mood (e.g., "VOLATILE" vs "CALM") to adjust position sizing.
- Confidence Scoring: Every trade is assigned a confidence value (0.0 to 1.0). Trades below the configured threshold (typically 0.85) are automatically discarded.
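A minimal sketch of the overlay's gate, assuming a trade is a dict with `Confidence` and `Regime` keys: the 0.85 threshold comes from the text, while the function name `filter_trades` and the 0.5 sizing factor for volatile regimes are illustrative assumptions.

```python
# Hedged sketch of the heuristic overlay: discard low-confidence trades
# and shrink position size in volatile regimes. The 0.5 sizing factor
# is an assumption, not the engine's actual parameter.
CONF_THRESHOLD = 0.85

def filter_trades(trades, threshold=CONF_THRESHOLD):
    accepted = []
    for t in trades:
        if t["Confidence"] < threshold:
            continue  # auto-discard low-conviction signals
        size = 0.5 if t.get("Regime") == "VOLATILE" else 1.0
        accepted.append({**t, "Size": size})
    return accepted

signals = [
    {"Ticker": "RELIANCE.NS", "Confidence": 0.92, "Regime": "CALM"},
    {"Ticker": "TCS.NS", "Confidence": 0.60, "Regime": "VOLATILE"},
]
print(filter_trades(signals))  # only the 0.92-confidence trade survives
```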
Interacting with the Brain
As a developer, you interact with the Hybrid Brain primarily through the API or the specialized trading scripts.
Programmatic Access (Python)
To trigger the brain's reasoning process manually within the environment:
from src.brain.hybrid import HybridBrain

brain = HybridBrain()

# The .think() method runs the full LSTM + PPO inference pipeline
decisions = brain.think()

for trade in decisions:
    print(f"Ticker: {trade['Ticker']}")
    print(f"Action: {trade['Action']}")
    print(f"Confidence: {trade['Confidence']:.2%}")
    print(f"Rational: {', '.join(trade['Rational'])}")
API Access (REST)
The frontend communicates with the brain via the FastAPI backend. You can poll the latest "thoughts" from the brain using the following endpoint:
Endpoint: GET /api/results
Response Structure:
{
"status": "success",
"data": [
{
"Ticker": "RELIANCE.NS",
"Action": "BUY",
"Confidence": 0.92,
"Rational": ["Strong Momentum", "Low Volatility Regime", "RSI Divergence"]
}
],
"is_thinking": false
}
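Consuming that payload from Python is straightforward. The snippet below parses the example response shown above; in a real client you would fetch the JSON over HTTP first (e.g. with the `requests` library), which is omitted here so the example runs standalone.

```python
import json

# Parse the /api/results payload (example body from the docs above)
payload = json.loads("""
{
  "status": "success",
  "data": [
    {"Ticker": "RELIANCE.NS", "Action": "BUY", "Confidence": 0.92,
     "Rational": ["Strong Momentum", "Low Volatility Regime", "RSI Divergence"]}
  ],
  "is_thinking": false
}
""")

# Only act on settled results: skip responses while the brain is mid-cycle
if payload["status"] == "success" and not payload["is_thinking"]:
    for trade in payload["data"]:
        print(f"{trade['Ticker']}: {trade['Action']} ({trade['Confidence']:.0%})")
```

Note the `is_thinking` flag: a polling client should skip (or retry) while it is true, since the `data` array may reflect the previous cycle.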
Configuration & Checkpoints
The Brain’s behavior is determined by the weights stored in the checkpoints/ directory.
- final_sft_model.pth: The "Knowledge Base" (directional accuracy).
- best_ppo.ckpt: The "Execution Logic" (risk management and timing).
To update the brain's intelligence, replace these files with newly trained weights from train_ppo_optimized.py.