AI Model Limitations

An honest account of what our AI can and cannot do — Last updated: April 2026

Tradewink uses artificial intelligence to screen markets, score setups, evaluate trade candidates, and explain decisions in plain English. This page documents the limitations of that technology. Read it before enabling any automated trading feature. If anything on this page would surprise you, the product is not safe for you to use yet.

Which models we use

Tradewink routes AI work to Claude (Anthropic), GPT (OpenAI), and Gemini (Google) through OpenRouter. The specific model assigned to a task varies by subscription tier, request type, and availability. No single model is treated as authoritative.

Models are general-purpose large language models. They are not purpose-built financial forecasting systems. They have been trained on public data with a knowledge cutoff, which means they have no access to real-time information unless we explicitly feed it to them as context. Any market data, news, filings, or price history cited in an analysis came from our data providers (Polygon, Finnhub, SEC EDGAR, etc.) — not from the model's training data.
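One way to picture that grounding step: the model never fetches anything itself; the platform assembles a context block from provider data and passes it in alongside the question. A minimal sketch, assuming a simple key-value payload (function and field names are illustrative, not Tradewink's actual API):

```python
def build_grounded_prompt(question: str, market_data: dict) -> str:
    """Assemble a prompt whose facts come from provider data, not model memory."""
    context = "\n".join(f"{key}: {value}" for key, value in sorted(market_data.items()))
    return (
        "Answer using ONLY the data below. If a fact is not present, say so.\n"
        f"--- DATA ---\n{context}\n--- END DATA ---\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "How did AAPL close relative to its 50-day average?",
    {"ticker": "AAPL", "close": 187.42, "sma_50": 182.10},  # illustrative values
)
```

The point of the pattern is that every number the model can cite is visible in the prompt, so it can be audited after the fact.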

Hallucination risk

LLMs can and do generate plausible-sounding but factually incorrect statements. This is a known, well-documented property of the underlying technology. It is not a bug we will eliminate — it is a characteristic of the tool. We mitigate it by grounding prompts in structured data from trusted providers, by using multi-agent debate for high-stakes decisions, and by logging every AI response for post-trade review.

Despite those mitigations, an AI-generated analysis may contain incorrect ticker references, wrong dollar amounts, fabricated news events, misremembered earnings dates, or invented company names. You should never place a trade based solely on an AI explanation without verifying the underlying data yourself.
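As an illustration of the kind of verification we mean, the sketch below compares a number quoted in an AI explanation against the provider value it should have come from. The function name and tolerance are illustrative, not part of the product:

```python
def verify_quoted_price(quoted: float, provider_price: float, tolerance: float = 0.005) -> bool:
    """Flag an AI-quoted price that drifts more than `tolerance` (fractional) from the provider's."""
    if provider_price == 0:
        return quoted == 0
    return abs(quoted - provider_price) / abs(provider_price) <= tolerance

# An explanation citing $187.40 against a provider close of $187.42 passes;
# a transposed-digit hallucination like $178.42 does not.
verify_quoted_price(187.40, 187.42)   # True
verify_quoted_price(178.42, 187.42)   # False
```

The same pattern applies to earnings dates, share counts, and ticker symbols: check the claim against the source of record, not against the model's own restatement of it.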

Why backtests are not predictions

Tradewink publishes backtest and paper-trading results for several strategies and LLM benchmarks. These results are historical simulations, not promises about future performance. They are subject to the standard limitations of hypothetical results: they benefit from hindsight, they can be overfit to the specific period tested, they do not account for the full cost of real-world slippage and partial fills, and they assume execution at prices that may not have been available in size.

Survivorship bias is a real factor: a backtest run against today's surviving tickers silently excludes companies that delisted or went bankrupt during the test period, which flatters historical returns. Strategies that looked good in backtests have repeatedly failed in live trading. A strategy that worked for five years can stop working tomorrow if the underlying market regime changes. Past performance is not an indicator of future results — not because the phrase is required by regulators, but because it is literally true.
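To make the slippage point concrete, here is a toy haircut applied to per-trade backtest returns. The cost figures are placeholders, not measured Tradewink numbers:

```python
def net_backtest_returns(gross_returns, slippage_per_trade=0.001, fee_per_trade=0.0005):
    """Subtract an assumed round-trip slippage and fee from each gross per-trade return."""
    cost = slippage_per_trade + fee_per_trade
    return [r - cost for r in gross_returns]

gross = [0.004, -0.002, 0.003, 0.001]   # hypothetical per-trade returns (+0.6% total)
net = net_backtest_returns(gross)       # the same trades net out to roughly 0.0% total
```

Even this crude model shows how a strategy that looks profitable on paper can be flat or negative once realistic execution costs are charged against every trade.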

'Self-improving' means 'continuously adjusted', not 'continuously better'

Tradewink includes a reinforcement-learning strategy selector that reweights strategies based on recent performance, an ML retrainer that refits models on a rolling window, and a dynamic exit engine that adjusts stops in real time. We describe this as 'self-improving' because the system updates itself without manual intervention.

It does not mean the system gets monotonically better over time. A model that retrains on recent data can overfit to a regime that has already ended. A reinforcement-learning selector can get trapped in a local optimum. An auto-tuned stop can be too tight in a volatile session. We guard against the worst failure modes with validation gates, walk-forward testing, and manual review of new strategies before they go live — but we cannot guarantee that any update will improve results.
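Walk-forward testing, in the sense used above, means fitting on one window of history and validating only on the data that comes after it, then rolling both windows forward. A minimal split generator, with illustrative window sizes:

```python
def walk_forward_splits(n_samples: int, train_size: int, test_size: int):
    """Yield (train_indices, test_indices) pairs that roll forward through time.

    Each test window comes strictly after its training window, so a candidate
    model is never validated on data it was fitted to.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

# 10 samples, train on 4, test on the next 2, roll by 2: three splits.
splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
```

A strategy update has to hold up across every out-of-sample window, not just the most recent one, before a validation gate like this would pass it.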

Confidence scores are probabilistic, not guaranteed

Many signals and trade evaluations include a confidence score (typically 0-100). This score reflects the AI's internal assessment of how well the setup matches its criteria — it is not a probability of profit. A signal scored 95 is not 95% likely to be profitable. It is a signal where the AI judged its own criteria to be strongly met.

High-confidence signals still lose. Low-confidence signals still win. The score is useful for ranking and filtering, not for predicting individual trade outcomes.
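In practice that means confidence works as a sort key and a filter threshold — not as an input to position sizing or profit expectations. A sketch, with an illustrative floor value:

```python
def rank_signals(signals, min_confidence=60):
    """Keep signals at or above a confidence floor and order them best-first.

    Confidence orders candidates for human review; it is not a win
    probability, so it is deliberately NOT used to size positions here.
    """
    kept = [s for s in signals if s["confidence"] >= min_confidence]
    return sorted(kept, key=lambda s: s["confidence"], reverse=True)

signals = [
    {"ticker": "XYZ", "confidence": 95},  # can still lose
    {"ticker": "ABC", "confidence": 41},  # filtered out, could still have won
    {"ticker": "QRS", "confidence": 72},
]
top = rank_signals(signals)  # XYZ first, then QRS; ABC dropped
```

Note what the function does not do: it never converts the score into an expected return or a bet size.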

Human oversight is required by default

Tradewink ships with trade confirmation required by default. The system will identify a candidate, generate an analysis, and wait for your explicit approval before placing any order. You can opt into fully autonomous execution, but doing so requires an additional confirmation step, and by enabling it you accept full responsibility for every resulting trade.

We designed the product this way on purpose. The AI is best treated as a tireless research assistant that scans markets you do not have time to watch, surfaces setups you would not have found, and drafts a rationale you can accept or reject. The final decision should remain yours.
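The default flow can be sketched as a gate that blocks order placement until a human approves. The types and names below are illustrative, not the real Tradewink interface:

```python
from dataclasses import dataclass

@dataclass
class TradeCandidate:
    ticker: str
    side: str       # "buy" or "sell"
    quantity: int
    rationale: str  # AI-drafted explanation, for the human to accept or reject

def execute_if_approved(candidate: TradeCandidate, approved: bool) -> str:
    """Place an order only when a human has explicitly approved the candidate."""
    if not approved:
        return f"HELD: {candidate.side} {candidate.quantity} {candidate.ticker} awaiting approval"
    return f"SUBMITTED: {candidate.side} {candidate.quantity} {candidate.ticker}"

candidate = TradeCandidate("XYZ", "buy", 10, "AI-drafted setup rationale")
execute_if_approved(candidate, approved=False)  # nothing is placed without a yes
```

The structural point is that the approval check sits between analysis and execution, so a hallucinated rationale can be caught before any money moves.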

Bottom line

AI is a research accelerator, not a profit machine. It will make mistakes. It will miss regime changes. It will occasionally generate confident-sounding analysis about a ticker or event that does not exist. Treat every AI-generated output as a starting point for your own review, not as a recommendation to act.

Tradewink is not a registered investment adviser, broker-dealer, or financial planner. All data, signals, and analytics on this page are for informational purposes only and do not constitute investment advice, financial advice, or a recommendation to buy or sell any security.

Past performance does not guarantee future results. Trading involves substantial risk of loss, including the possibility of losing more than your initial investment. You are solely responsible for your own trading decisions.