Colophon - League Donation

The ranked list is a kind of fiction.

That isn’t a complaint. It’s a design observation. When a fantasy tool presents a numbered list from player 1 to player 300, it’s encoding a claim of precision that the underlying data doesn’t support. The sources disagree. The projections diverge. The ADP reflects market sentiment that may or may not track real value. Averaging those inputs and presenting the result as a settled order resolves the uncertainty on the user’s behalf, and that resolution is where the signal goes to die.

League Donation starts from a different premise: the uncertainty is the information.

When FantasyPros ECR, FanGraphs Steamer, and the connected platform’s internal rankings converge on a player at 35th, the consensus is probably right. When they diverge by 40 positions, that divergence tells you something the composite doesn’t. A player ranked 35th by consensus and 75th by Statcast-based projections is a question worth asking. A player ranked 35th by consensus but going in the 80s in NFBC drafts is a market inefficiency. The tool was built to surface those gaps rather than collapse them into a clean number that papers over what the sources are actually saying.

Everything downstream follows from that premise.

Philosophy

The tool is free. That isn’t an incidental feature. It’s consequential to the integrity of what the tool does.

A monetized analytics tool has a structural conflict it can’t fully escape. Premium tiers create pressure to withhold the most useful features. Advertising creates implicit incentives to keep users engaged rather than informed. Affiliate relationships can subtly tilt recommendations in directions that are commercially convenient. None of this requires conscious intent. It’s embedded in the incentive structure. The tool avoids these pressures by having no revenue model. The analysis is accountable only to whether it’s right. That’s the only constraint worth having.

The Data Layer

The tool supports two sports, baseball and football, across four platforms: ESPN, Yahoo, Fantrax, and Sleeper. Each platform uses different data structures, different player ID systems, different roster representations, and different scoring category schemas. Each sport on each platform introduces further variation. A tool that works across all of them can maintain parallel analytical codebases per combination or define a common schema and write adapters into it.

The adapter approach is more expensive to build and dramatically easier to maintain. Each platform’s raw API response flows through its adapter and emerges as a normalized LeagueData object with a consistent shape: league settings, team rosters, schedule, draft history, scoring categories, roster slot counts, and metadata. The analytics layer never touches the raw platform response. It only reads the normalized output. Every player object carries the same fields regardless of platform: positions, platformId, draftRank, proTeam, dateOfBirth, percentOwned. Every roster entry carries slot (a position string like ‘SP’ or ‘WR’ or ‘BN’) and acquisitionType. The raw platform shape is fully consumed inside the adapter for normal operation. ESPN’s settings retry is the explicit exception: when the initial settings response is incomplete, the raw ESPN payload is held briefly under S._rawEspn so a second fetch can merge richer fields and the adapter can re-run against the augmented payload. That field is read only by ESPN-specific code paths.

This separates the data translation problem from the analysis problem. When Yahoo changes its API response format, one function changes. When a new platform is added, a new adapter is written and the entire pipeline inherits it without modification. The normalization logic is also auditable in isolation, which matters for verifying that the tool is reading the data correctly.

ESPN was the first platform integrated and its adapter carries the most translation complexity. ESPN’s slot map uses integer IDs rather than position strings. Its player position eligibility is encoded as an integer array of slot IDs rather than a positions array. Its projection stats use a nested structure with statSourceId and statSplitTypeId selectors. Its innings pitched are stored as total outs rather than decimal innings. And several stat ID slots contain values that disagree with their own STAT_NAMES keys: stat 8 is TB, not IBB. Stat 9 is SLG, not HBP. Stat 7 is singles, not walks. The adapter handles all of this: integer-to-string slot resolution, position extraction from eligibility arrays, stat correction for misaligned IDs, IP encoding conversion, and pro team abbreviation lookup from numeric proTeamId. The slot map lives inside the adapter file. Nothing outside the adapter touches ESPN’s raw field names, by design.

ESPN’s API also presents preseason extraction challenges. Roster settings are sometimes incomplete, returning category totals without individual slot counts or omitting positions with zero players assigned. The tool handles this through a two-pass fetch: initial settings from the primary endpoint, followed by a secondary fetch that often carries fuller data. When the secondary response contains better roster or scoring information, it merges into the raw response and triggers renormalization through the adapter. The normalized schema is always downstream of the freshest available data.

Yahoo’s adapter has its own dialect. Yahoo’s stat IDs disagree with ESPN’s, so the adapter maps them to the canonical set the pipeline uses. Yahoo provides positions as strings rather than slot integers, which is one less translation. Yahoo’s OAuth flow introduces a lifecycle problem the ESPN connection doesn’t have: access tokens expire after one hour, so the server-side proxy silently refreshes them using the stored refresh token and passes new credentials back to the client. If a user returns after the four-hour window has elapsed, Yahoo asks them to authorize again, which is the price of using OAuth that takes its expiry seriously.

Fantrax uses string IDs for both teams and players, so the tool assigns sequential numeric team IDs at normalization time and keeps a bidirectional map. Fantrax has no draft rank or ownership-sorted ranking field, so when a Fantrax JSESSIONID is available the adapter pulls real ADP from the live draft endpoint; without it, FantasyPros ECR becomes the sole ranking authority and the platform is excluded from composite calculation. Fantrax’s roster response carries enough context to infer acquisition type from the draft picks data, so the keeper calculator and roster badges work without a dedicated transactions endpoint.

Sleeper is the simplest adapter and the most analytically consequential for football. The API is public, JSON-native, and needs no authentication for read access. A username resolves to a user ID, the user ID resolves to leagues, each league resolves to rosters, matchups, drafts, and transactions through flat endpoints. There is no OAuth dance, no cookie relay, no HTML scraping. What makes Sleeper more than a fourth integration is that its stat vocabulary is the canonical one the football pipeline uses internally. Sleeper names its stats in compact string keys (pass_yd, rush_td, rec, rec_yd, fum_lost), and every football stat inside the analytics layer is keyed the same way. ESPN’s integer football stat IDs, Yahoo’s string football stat IDs, and Fantrax’s scoring labels all map through their adapters into Sleeper’s keys before the composite pipeline sees them. A Sleeper league is therefore a zero-translation case: the raw scoring response is already speaking the tool’s native football vocabulary. The other three adapters do the translation work into that vocabulary on behalf of their platforms.

All four adapters stamp a platform field on the normalized output. Every function downstream that needs platform-specific behavior reads S.league.platform rather than maintaining its own detection. The source of truth is in one place.

Four raw API shapes converge to one normalized schema. Nothing downstream knows which platform produced the data.

The Composite Ranking Pipeline

A source is any input that assigns ranked positions or projected statistics to players. FantasyPros ECR is a source. A FanGraphs Steamer CSV is a source. The connected platform’s internal rankings are a source. NFBC ADP is a source. Each measures something slightly different.

ECR is aggregated expert opinion, a consensus of analysts who have thought carefully about how the season will go, adjusted for positional context and draft format. Projection systems like Steamer are statistical models built on historical performance, aging curves, park factors, and regression to mean. They know nothing about narrative or reputation. ADP is a market price, what actual participants in real drafts have been paying for players. It carries behavioral patterns, recency bias, league-size effects, and the accumulated heuristics of the fantasy-playing population. These three things diverge for structural reasons. Experts may underweight regression toward the mean on breakout candidates. Projection models don’t capture role uncertainty or injury history that isn’t yet in the stats. ADP reflects what the market believes, which is often a lagged version of what the analysis shows.

The composite pipeline takes all active sources, weights them according to their sourceType and user configuration, and builds a composite ranking. But the visible output isn’t only the composite. Every player’s record carries each source’s rank alongside the composite rank so the user can see where the sources agree and where they don’t. A player with tightly clustered source ranks is well-priced by consensus. A player with a 40-position spread between sources is a live question. The spread is the story, and hiding it inside an averaged number defeats the purpose.

Tight clustering means consensus. Wide spread means the sources disagree. Both are visible alongside the composite.

Source weighting is configurable. FantasyPros consensus projections load by default because they are format-agnostic statistical models that provide immediate z-score and VORP signal without requiring any user action. ECR supplements them as a consensus ordering layer and serves as the fallback when projection data is unavailable. A user who imports a specific projection system from FanGraphs is adding league-calibrated weights on top of that base layer. Exposing the weighting as a choice means admitting the output isn’t objective. It isn’t.

The boot sequence is deliberately tiered. On initial load, the tool fetches FantasyPros consensus projections, real stat lines aggregated across multiple systems, before anything else. These drive z-scores, VORP, and tier clustering immediately. ECR loads alongside them as a secondary ordering signal and fallback. The user has a statistically functional tool within seconds. The platform connection, extended player pool fetches, and supplemental data requests all happen asynchronously behind a UI that’s already interactive. The slow work happens behind a fast surface.

Statistical Methodology

Z-scores normalize player projections across scoring categories by measuring each player’s projected contribution as a deviation from the position-eligible player mean, in units of standard deviation. The normalization is necessary because raw counting stats aren’t comparable across categories. A projection of 30 home runs and a projection of 40 stolen bases can’t be directly summed into a value number without first converting them to a common scale.

The critical implementation detail is that z-scores are calibrated to your league’s draftable pool, not to a global population. The mean and standard deviation for each category are calculated from the set of players who would realistically be drafted in a league of your size and format. Standard z-score implementations often use fixed sample sizes or global pools that don’t reflect your actual draft context. Recalculating from the actual draftable pool means that positional depth, team count, and roster construction all influence the underlying distribution. A 12-team league and an 8-team league playing identical categories have different z-score distributions because the relevant player pool is different.

VORP extends z-score analysis by establishing a replacement-level baseline for each position. The replacement player is identified as the last positionally-eligible player who would realistically be drafted in a league of your size, the best player available after the draft ends. Every player’s value is then measured as surplus above that floor rather than above zero or above the mean.

The positional scarcity effect becomes visible here. A catcher with average offensive production may carry a VORP that exceeds an above-average outfielder’s, because the replacement level at catcher is significantly lower than at outfield. The scarcity is real. The depth difference is real. Rankings that ignore positional context will systematically undervalue scarce positions and overvalue deep ones, and the error compounds through the draft as positional runs develop. VORP handles this correctly by measuring each player against the actual alternative available at their position.

VORP measures surplus above what you could replace the player with. Scarcity makes the same production worth more at thin positions.

Tier clustering groups players into bands where the projected gap within a tier is smaller than the uncertainty in the projections themselves. The practical implication: worrying about player 12 versus player 15 within the same tier is a category error, because the projections that produced those ranks don’t support that level of precision. The tier boundary is where discrimination actually becomes meaningful. Two players in adjacent tiers are genuinely different by what the data can see. Two players within the same tier are not, and treating them as different produces decisions whose confidence exceeds their evidentiary basis.

The naive implementation divides the overall value range by a fixed constant and calls each jump of that size a tier break. One elite player at the top expands the range. The threshold scales to match. Real separations in the middle compress into a single band because the outlier inflated the denominator. The tool uses natural breaks instead. The algorithm identifies the N largest actual gaps in the sorted value distribution and cuts there, regardless of where those gaps fall in the overall spread. The breaks reflect the data’s own structure rather than an arithmetic fraction of its range. An outlier at the top gets its own tier. The separations that actually exist in the middle become boundaries. The ones that don’t, aren’t.

Naive tiers divide the range evenly. Natural breaks cut at actual gaps in the distribution.

Empirical Bayes shrinkage addresses the projection reliability problem for players with limited track records. A 23-year-old with 200 career plate appearances might project for a .340 batting average based on a small sample that happened to go well. The naive projection carries far more uncertainty than its precision implies. Shrinkage pulls extreme projections toward the population mean, with the magnitude of pull inversely proportional to sample size. Players with extensive major-league history are pulled only slightly. Players with limited history are pulled substantially. The resulting projections are less impressive but more defensible, and they reduce the drafting errors that come from treating small-sample performance as established quality.

The VORP calculation uses a greedy positional assignment algorithm to handle players with positional eligibility at multiple positions. Rather than assigning each player to their primary position, the algorithm maximizes total VORP across the roster by considering the full scarcity landscape. A player eligible at both second base and shortstop is assigned to whichever position maximizes the team’s collective surplus above replacement. This produces better draft recommendations than naive primary-position assignment in leagues with unusual positional depth distributions.

For baseball, dynasty valuation extends redraft VORP by applying age-derived multipliers from position-specific aging curves. A 26-year-old hitter at peak production receives a full multiplier. A 34-year-old with the same current VORP receives a discount that reflects the empirical rate at which production declines with age. Pitchers age differently than hitters, with earlier peaks and steeper declines, so the curves are separate. The resulting dynasty VORP is a single number that captures both present production and expected trajectory. It isn’t a projection of future performance. It’s a discount function applied to current value, which is a different and more defensible claim. Prospect rankings from MLB.com are integrated into the composite and assigned synthetic dynasty VORP based on ranking position, since prospects without major-league stats have no redraft value to adjust. Football dynasty valuation takes a different path, described in the football signal layer section.

The Baseball Signal Layer

Baseball and football share everything above this line. The composite pipeline, z-scores, VORP, tier clustering, and empirical Bayes shrinkage are sport-neutral and run unchanged on either. The two sports diverge here, at the signal layer, because the public data available to ground an analytical correction is different for each in kind, not degree. Baseball has Statcast. Football does not have a public equivalent, and the football signal layer is described in the next section. What follows applies to baseball.

Batting average depends on two things: contact quality and whether that contact found fielders. The second variable is close to random. Balls in play drop for hits or find gloves based on defensive positioning, spray direction, park geometry, and outcomes that aren’t repeatable in any meaningful sense. Over a season the noise largely cancels. Over a week, or a month, it doesn’t.

Statcast tracks exit velocity and launch angle on every batted ball. From those inputs, expected batting average is the historical hit rate on balls with similar velocity and angle profiles, independent of where any specific ball actually landed. A line drive at 105 mph has an xBA around .700. A soft grounder at 75 mph has an xBA around .100. The actual batting average reflects what happened. The expected batting average reflects what the contact quality predicts. When they diverge, the divergence has a direction and a correction mechanism built into it.

A player hitting .198 on an xBA of .285 is having his hard contact land in gloves at an unsustainable rate. The divergence is temporary. The direction is upward. The signal exists in publicly available data and is systematically invisible to anyone watching batting averages, because batting averages don’t separate contact quality from luck. The xStats system surfaces the gap and translates it into adjusted fantasy value through a single formula.

The adjustment blends each player’s projected stat with the Statcast-derived expected stat using a weight that increases with sample size: w = sample / (sample + k). At zero plate appearances the projection holds entirely. As contact data accumulates, authority transfers from the projection to the observed expected stat. This is empirical Bayes shrinkage applied to within-season updating: the projection is the prior, the expected stat is the likelihood, and the stabilization constant k governs how quickly the posterior moves away from the prior. Different stats deserve different patience. A hitter’s expected batting average needs more plate appearances to stabilize than a pitcher’s strikeout rate, because batting average involves more moving parts. Each category gets its own k, calibrated against historical stabilization research. K-rate and walk rate trust contact data fast. Batting average and slugging trust it slowly. RBI fades the projection last.

The projection anchor dissolves at different rates per stat. Fast-stabilizing stats trust the contact data sooner.

All blending operates in rate space. Projections for counting stats are full-season totals: 30 HR projected across 600 PA. Statcast-derived values are rates: 0.050 HR per plate appearance. Blending 30 and 5 directly, projected home runs against actual home runs four weeks into the season, produces nonsense. Blending 0.050 and 0.063 in rate space, then converting back to counting stats using the projection’s playing-time assumption, produces signal. Rate stats like AVG, ERA, and WHIP blend directly. Counting stats convert to rates first, blend, then convert back. The projection’s playing-time estimate is preserved. A player projected for 600 PA who has 80 PA in April still gets valued on 600 PA of production, but the per-PA quality is adjusted by what the contact data shows.

Not every category has a direct Statcast expected equivalent. Baseball Savant publishes xBA, xSLG, xwOBA, and xERA, which map directly onto AVG, SLG, wOBA, and ERA. Those categories receive the strongest correction. But most fantasy scoring categories are counting stats with no direct expected counterpart. The metric derives expected counting rates from batted-ball quality data rather than falling back to noisy in-season actuals. Barrel rate, which measures the fraction of batted balls with exit velocity and launch angle in the zone that produces a .500+ batting average and 1.500+ slugging, is the strongest single predictor of home run rate. Roughly 55% of barrels leave the park in the modern era, and the conversion from barrel rate to expected HR per PA is approximately linear and empirically stable. Expected run and RBI rates derive from xwOBA through the standard wOBA-to-runs conversion, which is well-established in sabermetric literature and stable year to year. Expected WHIP derives from xBA-against and walk rate: expected hits allowed plus walks, divided by innings pitched estimated from batters faced. Each derivation introduces a layer of inference beyond the direct xStats, and the tool labels that distance.

The labeling uses three signal tiers. Strong-signal categories have event-level Statcast evidence with minimal inferential distance: AVG via xBA, SLG via xSLG, wOBA via xwOBA, ERA via xERA. WHIP is also strong-tier because it assembles directly from xBA-against and walk rate, both of which are event-level or fast-stabilizing inputs, even though no single “xWHIP” field exists on Savant. Moderate-signal categories use derived rates or Savant indicators that involve a conversion step: HR from barrel rate, runs and RBI from the xwOBA-to-runs conversion, and pitcher strikeout and walk rates from Savant’s K% and BB% data. Batter strikeout and walk rates are weak-tier because they carry no contact-quality information and stabilize slowly at the sample sizes where the metric operates. Three categories receive no adjustment at all: stolen bases, wins, and saves. These are excluded from the blending pipeline entirely and pass through at their projected values unchanged. Stolen bases depend on speed and opportunity with no contact-quality analog. Wins depend on team run support and role. Saves depend on closer assignment. Pulling these toward in-season actuals would introduce noise without adding signal, so the system does not attempt it. The tier is stamped on each player’s adjustment so the user can see whether a value correction rests on event-level evidence, a one-step derivation, or a slowly-stabilizing rate.

Each category’s correction is labeled by the quality of evidence behind it. The user sees what’s driving the adjustment.

The adjusted stats feed a parallel z-score and VORP computation that runs independently of the projection-based pipeline. This separation is a design constraint, not an implementation preference. If every player’s stats shift, some up and some down based on contact quality, the population distribution changes. A player whose adjusted AVG is .280, up from a projected .265, needs to be scored against a pool where every other player also shifted. Plugging adjusted values into the projection-derived population baseline produces z-scores on a mismatched scale. The parallel pipeline builds its own population mean and standard deviation per category from the full adjusted pool, computes z-scores against that adjusted baseline, then runs VORP with greedy positional assignment from the adjusted z-scores. The existing projection-based VORP is untouched. The meaningful output is the delta between the two: how much a player’s contact-quality-adjusted value differs from projection-only value. That delta is the correction signal.

Two independent pipelines with separate population baselines. The meaningful number is the gap between them.

A catcher whose adjusted z-scores shift by +0.5 is worth more than an outfielder with the same shift, because the replacement level at catcher is lower and positional scarcity amplifies the correction. The delta captures this. It speaks in the same currency as the rest of the pipeline: surplus value above replacement, in a league of your size, with your scoring categories. The adjustment is not a separate analytical layer sitting beside the main rankings. It flows through the same positional scarcity logic that governs everything else.

The blend handles the season-aggregate gap. A player whose contact quality is consistently stronger than his surface stats has a positive correction across every category that supports one. But a season-aggregate signal is stable and slow. A hitter who went cold three weeks ago still looks fine by his season line because 500 plate appearances absorb the recent downturn. Two adjacent systems extend the signal to shorter windows.

The first is the trend classifier. It computes rolling 7, 14, and 30-game windows over each player’s game log and compares the recent window to an expected baseline. A 14-game wOBA collapse paired with a season xwOBA that sits well above actual is classified as an unlucky cold streak. A 14-game surge paired with an xwOBA that sits below actual is classified as unsustainable. The matrix produces six labels across two dimensions and each label maps to a specific claim about what the numbers predict next. A daily server-side scan computes the windows for every qualified player and writes them to a single blob. The client loads the entire pool in one request rather than fetching 700+ game logs individually. The classifier itself is a shared module loaded by both the client and the snapshot function, so the labels assigned overnight match the labels the browser would compute live.

The second is windowed xStats. Savant publishes expected stats at season granularity only. There is no last-seven-games xwOBA on the leaderboard. A separate snapshot pipeline captures Savant’s expected_statistics leaderboard once a day and writes it to a date-keyed blob. Two snapshots reconstruct the xStat for the window between them by exact arithmetic: the later cumulative numerator minus the earlier, divided by the later cumulative denominator minus the earlier. The window boundaries are game-aligned: the tool finds the date of the Nth-most-recent game in the player’s game log and fetches the snapshot from the day before that game, so the delta covers the same N games as the counting-stat rolling window. xBA and xSLG need at-bats as their denominator, which the leaderboard doesn’t publish, so a parallel call to a custom-leaderboard endpoint supplies them. xwOBA uses plate appearances directly. The reconstructed values appear in the player modal alongside the existing rolling-window counting stats. A reader looking at a hitter whose last-seven-games wOBA has collapsed sees the last-seven-games xwOBA directly below it, and the underlying contact-quality metrics two rows further down. The comparison is at matched window length. Whether the contact quality collapsed with the results or held steady while they fell is visible in the same column.

The trend classifier uses window-matched expected stats when the snapshot history supports it and falls back to the season aggregate when it doesn’t. The same daily snapshot pipeline that powers the windowed reconstruction also stores raw projection state alongside it, so the season’s adjusted values can be validated retrospectively against end-of-season outcomes. Projections are mutable documents that the sources overwrite as the season progresses. A day without a snapshot loses that day’s projection state permanently.

The limits are stated in the output. Expected statistics model contact quality. They don’t model pitch selection, lineup construction, park factors that vary across a split, or deliberate approach changes where the contact profile has shifted but the historical comparison period hasn’t caught up. A player flagged as undervalued by contact quality may have restructured his swing, making his pre-change and post-change profiles different in ways the model can’t see. The signal is a starting point for investigation, not a conclusion. The counting-stat derivations, barrel rate to home runs and xwOBA to runs, add a conversion layer the direct xStats don’t require. The derivation coefficients are empirical approximations calibrated against historical data, not physical constants, and the system labels which corrections rest on event-level evidence and which rest on derived rates with wider uncertainty bands.

The full derivation, including every per-category formula, every signal-tier mapping, every constraint on when a window value is shown, and the per-stat stabilization constants, lives on the xStats methodology page.

The Football Signal Layer

Football has no public Statcast. The NFL’s Next Gen Stats system tracks player tracking data at a similar event-level resolution to what Baseball Savant publishes, but the tracking data itself is not released. What reaches the public is a small set of aggregated metrics on an official page, not a downloadable leaderboard keyed to raw tracking inputs. There is no xBA for receivers, no xwOBA for running backs, no event-level model of expected fantasy points per touch that the tool can fetch, blend, and correct against. The absence is not a gap the tool has been slow to fill. It is a structural feature of what is publicly available, and an honest football signal layer has to be built on different foundations than the baseball one.

The football signal layer is opportunity, not expected outcome. The nflverse project publishes weekly aggregated player stats, derived from public play-by-play data, that expose the shape of a player’s role on his offense rather than the luck-adjusted quality of his touches. The four metrics that do the most work are target share (fraction of team passing attempts directed at a receiver), air yards share (fraction of team passing depth directed at a receiver), weighted opportunity rating (a combined score that treats a deep target as more valuable than a screen), and carry share (fraction of team rushing attempts given to a back). These are role measurements. They tell you what the team is giving the player, not what the player is doing with what he gets. For fantasy purposes this distinction matters more than it first appears. A receiver with a 28% target share on a bad passing offense is a better bet than a receiver with 4 catches for 90 yards on a 12% target share, because the 28% is repeatable and the 90-yard game is not. The tool fetches the nflverse weekly stats through a dedicated serverless function, aggregates them per player, and stamps the results onto each player entry.

Two derived signals extend the role measurement in the direction of change. Each opportunity metric is computed over the full available window and also over the most recent three games, and the comparison between the two produces a trend direction per player. A running back whose full-season carry share is 52% but whose last-three-week carry share is 31% has lost snaps, and the direction is visible before the box score catches up. A receiver whose trailing-three target share has climbed above his season average is being featured more recently, and the fantasy value is moving before the ranking services have adjusted. Expected points added per dropback, per carry, and per target are sourced in the same feed and give a coarse efficiency overlay on the opportunity numbers. The overlay is deliberately coarse. EPA is noisier than target share at short samples, and the labels say so.

Dynasty and trade value for football come from two external market-derived sources rather than from an age-curve derivation. KeepTradeCut publishes a crowdsourced dynasty value by running a continuous would-you-rather polling system across thousands of fantasy players. FantasyCalc publishes a similar market value derived from observed trade frequencies and league activity on connected platforms. The tool proxies both, caches the results in shared edge storage, and blends them into the composite for dynasty and trade-analysis views. The reason football uses a market-derived dynasty signal rather than an aging-curve projection is that football careers are shorter, more variance-driven, and more sensitive to coaching and depth chart shifts than baseball careers. A baseball age curve is a reliable population prior. A football age curve is a weaker prior that a live market routinely beats on individual players. The tool follows the signal that is stronger in each sport rather than applying the same method to both.

What the football signal layer does not do is produce an adjusted value delta comparable to the baseball xStats pipeline. There is no parallel z-score pass running on contact-quality-adjusted statistics, because there are no contact-quality-adjusted statistics to run it on. Opportunity metrics enrich the composite and drive a separate set of insight cards (role changes, target share climbers, carry share losers), but they don’t re-run VORP against a second population baseline. The football composite is a projection-based composite enriched with opportunity context and market-derived dynasty value. The baseball composite is a projection-based composite plus an independent adjusted-value pipeline built on event-level expected statistics. Each sport gets the analytical depth its public data supports.

The Insights Engine

The insights system generates observation cards from live league data. It isn’t a recommendation engine. That distinction shapes everything about how it’s built.

A recommendation engine takes data and produces a directive: start this player, drop that one, make this trade. It resolves uncertainty on the user’s behalf by presenting conclusions as actionable. An observation engine surfaces what the data shows and leaves the inference to the user. The difference is in what each tool claims to know. A recommendation engine claims to know enough to tell you what to do. An observation engine claims only to see what’s in the data and to report it clearly.

Insight cards are confidence-tiered. High-confidence observations are drawn from hard data with minimal inferential distance: your current matchup’s W-L-T record by category, players whose surface stats diverge from expected stats by a statistically significant margin, rosters with clear positional imbalances. Medium-confidence observations involve one or two inferential steps and carry explicit uncertainty markers. Low-confidence observations are labeled as speculative with a specific basis described.

Every insight card that makes a directional claim also specifies what the claim can’t account for. The matchup category tracker shows current margins and flags categories close enough to potentially flip. But the tool is explicit that closeness in a category isn’t the same as flippability. Flippability depends on remaining games, bench depth, opponent lineup decisions, and category volatility across the scoring period. None of those factors are modeled. The insight says: this category has a small margin. It doesn’t say you can flip it by starting a specific pitcher tonight. That second inference requires information the tool doesn’t have, and the tool says so rather than supplying a confident answer from incomplete inputs.

This design makes the tool less immediately satisfying. Users who want to be told what to do will find it frustrating, and that is a real cost. The tradeoff is that when the tool does make a claim, the claim is honest about its basis. A tool that always has an answer is often wrong in quiet ways. A tool that knows its limits builds a different kind of trust.

Technical Architecture

The tool is vanilla JavaScript with no framework dependencies. Four platform adapters, a sport module per supported sport, roughly two dozen serverless functions, roughly fifty client-side modules, and a web worker for the composite pipeline. The decision follows from the requirements.

React and Vue solve problems this codebase doesn’t have. They manage component state isolation across large applications with many collaborating developers. They solve the re-render coordination problem at scale. This application has one developer, a documented global state object S, and performance requirements that a virtual DOM reconciliation layer would work against. Every framework adds a bundle weight and an abstraction cost. The bundle delays initial parse time. The abstraction sits between the code and the DOM, adding overhead to every state change while the framework decides what needs to update. For an application that needs to build a composite ranking pipeline and render it to a table as fast as possible, those costs aren’t negligible and the problems being solved aren’t present.

The architecture substitutes framework patterns with direct decisions. State is a single documented global, S, with typed fields and a generational counter that increments on meaningful state changes and invalidates all derived caches. The normalized league data lives at S.league and is the canonical surface for shared code. ESPN’s preseason settings response is sometimes incomplete, so the raw ESPN payload is held under S._rawEspn for the settings-retry path that re-runs the adapter against an updated response; that field is read only by ESPN-specific code. Component isolation is tab-scoped rendering: each tab has its own render function that produces an HTML string and sets innerHTML once. One DOM write per render. No diffing. No reconciliation. User interaction is handled by a single event listener that reads data-action attributes and dispatches to named handler functions. One listener serves the entire application. New interactive elements can be added anywhere without listener registration.

The adapter pattern enforces a strict boundary between platform-specific and platform-agnostic code. Each adapter is a pure function: raw API response in, normalized LeagueData out. No side effects, no state reads, no DOM access. The normalized output provides every field that shared code needs: player positions as string arrays, lineup slots as position strings, pro team abbreviations, draft ranks as flat integers, ownership as a flat percentage, date of birth for dynasty computation. Shared accessor functions read these normalized fields without fallback chains or platform conditionals. Position resolution, slot classification, ownership lookup, and player name extraction all operate on the same field names regardless of which platform produced the data.

The sport module is the seam that lets the same analytics layer operate on two games without either game leaking into the other. Each sport module publishes the same interface: the list of positions, the set of pitching or non-offensive positions, the canonical stat namespace, the mapping from stat IDs to display labels, the default roster construction, the fantasy point derivation, the season length, and the aging curve where one exists. sport-baseball.js publishes these for MLB. sport-football.js publishes them for NFL, keyed on Sleeper’s canonical stat vocabulary so the football codepaths speak one language regardless of which platform the user connected. The analytics pipeline reads through the sport module. The renderers read its category labels. The adapters translate platform-specific stat identifiers into the sport module’s namespace before handing off. Adding a scoring category to a format is a one-file change. Adding a new sport is a new module and adapter extensions. The boundary protects the analytics layer from sport-specific assumptions leaking into supposedly neutral code, and it is why the composite pipeline, z-score machinery, VORP calculation, tier clustering, and empirical Bayes shrinkage all run unchanged on either game.

The analytics pipeline runs once per state change and produces a sorted composite array that every renderer treats as immutable. Z-scores, VORP, tier clustering, disagreement metrics, and empirical Bayes shrinkage run in a single pass over the player pool. When the pool is large enough that the main thread would block during the pass, the work runs in a web worker. The worker loads the same analytics module via importScripts, receives the composite array and configuration over postMessage, computes the full set of z-scores and replacement levels, and returns the result without touching the DOM. Browser globals that the module references but doesn’t call during the pass are stubbed inside the worker. The main thread stays responsive during recomputation. For the baseball pipeline, a parallel adjusted pass runs independently when Statcast data is available: blended stats, recomputed population baselines, adjusted z-scores, and adjusted VORP, stamped back onto the composite entries as additive fields without mutating the projection-based values. Renderers read the composite array and produce output. They don’t run analysis. The analytics pipeline doesn’t touch the DOM. The separation keeps both sides simple and testable in isolation.

Some logic has to run in two places. The trend classifier executes inside the browser when a user opens a player modal and inside a scheduled server function during the daily scan. Both runtimes load the same shared module at startup, with the same thresholds and wOBA weights sourced from the same constants. Drift between the two implementations becomes structurally impossible rather than discouraged.

Caching is layered at multiple levels. In-memory caches handle expensive derived calculations within a session: scarcity tables, roster composition, VORP baselines. Browser localStorage handles persistence across sessions, keyed by league ID, season, and data generation counter to prevent stale reads. Edge caching on the server layer handles shared data: consensus rankings, league schedules, and ADP sources don’t vary per user and don’t need to be fetched individually. Each cache TTL is tuned to the update frequency of the data it holds. Completed-week boxscores are cached permanently. Current-week data is cached for five minutes. Live game scoreboard data expires in sixty seconds. The effect is that most interactions hit local memory rather than the network, and the network requests that do fire are often served from a cache rather than a cold origin call.

Most interactions hit local memory. Network requests hit edge caches. Cold origin calls are rare.

A thin server layer handles API proxying and security headers. ESPN, Yahoo, and Fantrax require server-side requests because the browser cannot reach them directly without CORS violations or leaked credentials. Sleeper’s API is public and CORS-friendly, but its full player database is a twenty-megabyte response that would waste bandwidth if every client fetched it independently, so the Sleeper proxy exists for caching rather than for access. The proxy layer is mostly stateless, adds consistent security headers, and returns responses with appropriate cache controls. The one warm-state exception is a per-IP burst counter on the ESPN kona_player_info path, which lives on the warm Lambda for the lifetime of the instance to absorb client retry storms; the limit is per-instance rather than fleet-wide. Nothing else persists across requests, and there is no session context to manage.

Scheduled functions invert the otherwise reactive pattern. The baseball pipeline runs two. The trend scan fetches game logs for every qualified player in the Statcast pool daily during baseball season, computes rolling windows and trend classifications, and writes the results to a shared blob. The xStats snapshot captures Savant’s expected statistics leaderboard each morning alongside the current projection cache, written to a date-keyed blob so later reads can difference two snapshots into a window-matched expected stat. The football pipeline runs a parallel weekly snapshot. The nflverse advanced rollup is a season-to-date cumulative view; differencing successive weekly snapshots reconstructs per-week deltas and rolling-window views of opportunity and efficiency that the upstream source does not retain. One blob per season week, keyed by week number. All three inversions exist for the same reason: the alternative is hundreds of individual fetches per user session, and the work doesn’t depend on which user is asking. The server does it once. Every client session benefits. The reactive per-player paths remain as fallbacks, so nothing breaks if a blob is missing.

Demo mode exists because both sports have long offseasons and the platform APIs only return rich data while the season is live. A tool that touches matchups, weekly boxscores, scoring period stats, transactions, and roster moves needs all of those populated to develop against, and they aren’t between November and March (baseball) or February and August (football). The demo synthesizes a complete league state from the current player pool: a 10-team snake draft fills a standard Superflex football lineup or a standard category-league baseball lineup, mid-season matchup results accumulate, category wins derive from actual roster VORP, transactions fire on a realistic cadence, and live game states match the active sport. A dynasty checkbox at launch activates deeper rosters, age-adjusted valuations, and the Farm or Rookies tab.

The architectural point of the demo is that it builds its league data in the same normalized shape that the four platform adapters produce. Teams carry normalized roster entries with slot strings and player objects. The schedule, draft picks, and metadata all match the adapter output shape exactly. Every analytics pass and every renderer receives data indistinguishable from a live connection. When a feature works in demo mode, it works in production. When it breaks in demo mode, the bug is real. The demo isn’t a stub or a fixture; it’s a full simulation that runs through every code path the production tool runs through, which is what makes it usable for development through both offseasons.

Mobile is a first-class layout, not a scaled-down desktop. The breakpoint response is compression and reordering rather than feature removal. Tab bars overflow horizontally with fade-mask indicators instead of hiding tabs behind a disclosure menu. Toolbars reorder so control panels open directly beneath the toggles that govern them. Dense tables compress into single-row layouts instead of stacking into multi-row entries that break scanning. Dark mode propagates through every surface that renders, including exported images and icon glyphs. The analytical depth on a phone is the same analytical depth on a laptop, read through a different layout.

A connected league is not required to use the tool. FantasyPros consensus projections and the supporting ranking sources load on first paint and drive z-scores, VORP tiers, and composite ranks against the full draftable pool. A sport toggle flips the rankings between baseball and football, triggering the appropriate source reloads. Roster-dependent surfaces (Rosters, Matchups, Trades, Keepers) hide themselves in this mode rather than rendering with empty context. Connecting a league enables those surfaces; the rankings themselves were always free.

Performance

There is no build step.

No transpilation. No bundling. No minification pass. No webpack, no Vite, no esbuild. The JavaScript that ships to the browser is the JavaScript that was written, character for character. The browser’s parser receives it directly.

This is not a philosophical stance. It’s a performance decision with measurable consequences. A build step exists to solve problems that this codebase doesn’t have: code splitting across lazy-loaded routes, JSX compilation, TypeScript erasure, polyfill injection for older browsers. Each of those transformations adds indirection. The bundler rewrites import paths. The minifier renames variables. The source map generator builds a parallel representation so you can debug the code you wrote instead of the code that shipped. The result is a toolchain that sits between you and the thing you built, adding latency at deploy time and opacity at debug time.

Without a build step, there is nothing to configure, nothing to cache-invalidate, nothing that can silently transform your code into something that behaves differently than what you tested. When something breaks, the stack trace points at the actual line in the actual file. When something is slow, the profiler shows the actual function. There is no gap between the code and the truth about the code. The toolchain you would otherwise spend hours debugging at midnight does not exist to debug.

The analytics pipeline computes z-scores, VORP, tier clustering, empirical Bayes shrinkage, and positional scarcity for several hundred players in a single pass. The pass runs in a web worker when the pool is large enough that a main-thread synchronous pass would produce visible jank, and on the main thread otherwise. Either path produces the same result stamped onto the composite array. That result is stable until the next data change triggers a full recalculation. Every renderer downstream reads it without recomputation.

First contentful paint happens before the network requests for league data have returned, because the initial render path (HTML parse, CSS apply, font load, first table render from cached composite data) has no framework initialization overhead competing for the main thread. Tab switches are instant: each tab’s content is generated as an HTML string and written to the DOM in a single innerHTML assignment. No virtual DOM diff. No reconciliation. One write, one paint.

Network requests are structured so that the user never waits for data they don’t need yet. The boot sequence loads consensus rankings first, renders a usable table, then fetches league data, then loads supplemental projections, then populates the free agent pool. Each phase renders its results as they arrive rather than blocking on the slowest request. The matchup scoreboard lazy-loads player stats only when the user opens a specific matchup. Statcast data loads in the background after the composite pipeline completes, populating the adjusted value column without blocking the initial render. The tool never fetches data speculatively. Every request is triggered by a user action or a visible render path.

Each phase renders as it arrives. The user has a working table before the slowest request finishes.

All visual transitions and animations are CSS-only and compositor-thread-accelerated, decoupled entirely from JavaScript execution.

The Editorial Layer

The tool produces analytical output. Field Notes writes about it. The product without the editorial layer is a set of dashboards, useful but mute about what to look at first. The editorial layer without the product is opinion. The pairing closes the loop: the tool produces the data, the editorial layer demonstrates the read, the reader can then perform the same read against their own league’s data using the same tool. The piece teaches a method by walking through a case. The reader extracts the method, not the conclusion.

Pieces are short and structured around a single observation. Baseball pieces work from Statcast and contact quality. Football pieces work from opportunity and role signals. The recurring shapes are a Liar (a player whose surface stats overstate the underlying contact quality), a Truther (a player whose contact quality confirms or complicates the surface line), a Portrait (a fuller picture built from multiple inputs), and a Mechanism (a piece that walks through how a particular signal is constructed and what it shows when applied).

The editorial layer is not promotion for the tool. It’s an additional surface where the same epistemological commitments apply. If the editorial layer ever advocated for a position the tool couldn’t support, something would have to break first.

The Diagnostic Layer

The tool runs a substantial amount of computation in a web worker thread. Composite ranking, source normalization, projection scaling, VORP calculation, and aging-curve projection all execute off the main thread to keep the UI responsive. The cost of that decision is that the worker thread cannot write to the browser console; debug output from inside the pipeline is invisible to anyone trying to diagnose a misbehaving rank.

The diagnostic layer solves this by stashing intermediate state on the global state object during normal renders. The composite pipeline records its source-scale detector reads on S._lastSourceScales: one entry per imported projection feed, capturing the top-five FPTS mean and the action taken (cached, normalized, weekly-passthrough, or skipped). None of these are surfaced in the UI. They exist purely so that the next step after “this projection looks wrong” is opening the console and reading the breakdown.

The source-scale detector is the canonical example of why this matters. NFL projection feeds arrive at three different scales: weekly, per-game, or full-season. A K or DST projection at a season scale (around 150 fpts) looks indistinguishable from a star skill player projection at a weekly scale (around 25 fpts) until you compare against the rest of that source’s players. The detector reads the top-five mean of each source and flags anything that exceeds an expected weekly ceiling. The threshold is computed dynamically from the league’s scoring weights and floored at 30 fpts: an original fixed cut at 100 missed K and DST projections that landed in the 70-90 zone, the dynamic floor at 30 catches them while the per-league ceiling stays clear of the legitimate weekly maximum for top skill positions. The diagnostic stash records the top-five mean per source so a future divergence is one console read away from a root cause.

Defensive caps run at three layers. Producer-side analytics applies a cap when finalizing the cp.proj field. Power Rankings applies an independent consumer-side cap when reading projections to compute team totals. Positional Strength applies the same cap when filling its heatmap cells. The redundancy is deliberate. Single-source-of-truth normalization works when every consumer reads from the same canonical field; in practice, several consumers read fallback chains, and a value that escapes the producer cap can still leak through if any consumer skips the normalized field. Catching the leak at every layer is cheaper than guaranteeing every consumer reads correctly.

This pattern, audit infrastructure exposed at boundaries rather than concealed in logs, follows the same principle that governs the editorial voice: show the derivation. A claim is only trustworthy if a skeptical reader can trace it to its inputs. The same applies to a number rendered in a Power Rankings cell. The user shouldn’t need to trust the rank. They should be able to read the breakdown that produced it.

Cross-Sport Architecture

The sport module pattern from Technical Architecture handles the obvious sport-specific behavior: positions, stat namespaces, fantasy point derivation. What it also carries are the projection accessors and the calibration constants that govern derived analytics. NFL reads starterProj as the canonical projection field; MLB reads proj. NFL multiplies weekly projections by 17 to derive a season scale where comparisons need it; MLB’s native projections are already season-scale and the multiplier is 1. The per-pick draft grade scale uses pickGradeScaleCoef and pickGradeScaleFloor, which differ by an order of magnitude across sports because the player pool and the natural rank variance differ by an order of magnitude. The team-grade aggregation uses gradeAggregateTrimPct to drop the worst N% of pick grades before averaging, so the body of a draft defines its grade rather than its tail. NFL trims 10%, MLB trims 15%. Each constant was tuned independently against real drafts in each sport, not picked as a compromise that fits neither.

Below the sport modules, the four platform adapters from the Data Layer feed projections in at different native scales: weekly fpts, per-game averages, full-season totals. Per-platform divisor normalization happens at the projection-import boundary in analytics.js, where each source’s native scale is detected and converted to the pipeline’s working scale before any composite math runs. The downstream pipeline never sees raw platform projections.

The sharpest architectural asymmetry between sports is the historical ADP layer. NFL has FantasyPros’ archived ECR snapshots going back several years; a draft from 2023 can be graded against the ADP that was actually live the day of the draft. MLB has no equivalent archive. Team Report Cards surface this directly: NFL cards show two grades (Day and Hindsight) plus a Value Added column; MLB cards show only Hindsight with a footnote explaining that day-of-draft ADP isn’t available for this sport. A degenerate-Day detector confirms the absence at runtime by checking whether Day GPA equals Hindsight GPA across all teams within a small tolerance, then suppresses the Day column when they match.

The architecture treats the sports as siblings, not as a primary case and a special case. New analytical features ship for both unless the underlying data isn’t available, in which case the feature ships for the supported sport and surfaces its own absence on the other. The user is told what isn’t there, not silently shown less.

What Isn’t Modeled

Expected statistics don’t account for deliberate approach changes. A hitter who restructures his swing to generate more lift will produce exit velocity and launch angle profiles that differ from his historical baseline. If the improvement is real but the expected statistics are being compared against a historical mean that predates the change, the system will flag the gap as undervalued contact quality when what it’s actually seeing is a new baseline. The signal isn’t wrong, the contact quality is what it is, but the direction of causation isn’t visible to the model.

The counting stat derivation coefficients used in the adjusted value computation are empirical approximations, not physical laws. The barrel-to-HR conversion rate of approximately 55% varies by park factor, player speed, and year-to-year league environment. The xwOBA-to-runs constants assume a league average wOBA near .315 and a wOBA scale factor near 1.245, both of which are stable but not fixed. RBI derivation is the weakest link: a player’s expected RBI rate depends on teammate on-base rates and lineup position, which the model does not attempt to capture. It preserves the projection’s lineup-context ratio and adjusts the underlying offensive quality, which is the best available approximation that avoids introducing team-context noise. The coefficients are validated against end-of-season data and recalibrated where the error exceeds 5%.

Cross-platform projection sourcing isn’t implemented. A connected Yahoo league uses Yahoo’s rankings as the platform source. A connected ESPN league uses ESPN’s. Fantrax uses real ADP when a session cookie is available and falls back to FantasyPros ECR as the sole ranking authority when it isn’t. Sleeper has no meaningful native ranking field in its feed and is treated like Fantrax in the fallback case: FantasyPros ECR is the ranking authority, supplemented by KeepTradeCut and FantasyCalc for dynasty and trade value. There’s no mechanism to load ESPN projections into a Yahoo league or vice versa. FanGraphs imports serve as the primary analytical supplement for baseball and cover most of the use case, but the connected platform’s rankings are always the platform source when the platform provides them.

The insights engine doesn’t model schedule remaining. It sees what has happened in the current scoring period. It doesn’t project what will happen based on remaining game counts, opponent strength, weather, or lineup decisions. This is a tractable problem. The current judgment is that a clearly-labeled, honest partial picture is more useful than a confidently-stated incomplete one.

The platform APIs that power the connection layer can change without notice. A working feature today can break tomorrow through no fault of the tool. This is the permanent condition of building on top of third-party platforms, and the only honest answer to it is to keep watching for the breakage and fix it when it happens. The alternative is to not build the tool.

Dynasty aging curves are population-level averages, not individual projections. A 32-year-old might defy the curve for two more years or fall off a cliff tomorrow. The multiplier doesn’t know which. It provides a reasonable prior for valuation, not a forecast. Prospect rankings, similarly, are editorial opinions published by MLB Pipeline, not statistical projections. A top-ranked prospect fails to reach the majors at a nontrivial rate. The tool surfaces the ranking and the age. The user supplies the judgment.

That’s what this is. A ranked list that doesn’t pretend to be more certain than its inputs, built on APIs it doesn’t control, distributed for free to whoever finds it useful. The architecture follows from the analysis. The analysis follows from the premise. The premise is that uncertainty, shown clearly, is more useful than certainty, manufactured.

The list was always fiction. This is an attempt at something more honest.