Colophon
How this tool works, why it’s built this way, and what it can and can’t do.
The ranked list is a kind of fiction.
That isn’t a complaint. It’s a design observation. When a fantasy tool presents a numbered list from player 1 to player 300, it’s encoding a claim of precision that the underlying data doesn’t support. The sources disagree. The projections diverge. The ADP reflects market sentiment that may or may not track real value. Averaging those inputs and presenting the result as a settled order resolves the uncertainty on the user’s behalf, and that resolution is where the signal goes to die.
League Donation starts from a different premise: the uncertainty is the information.
When FantasyPros ECR, FanGraphs Steamer, and the connected platform’s internal rankings converge on a player at 35th, the consensus is probably right. When they diverge by 40 positions, that divergence tells you something the composite doesn’t. A player ranked 35th by consensus and 75th by Statcast-based projections is a question worth asking. A player ranked 35th by consensus but going in the 80s in NFBC drafts is a market inefficiency. The tool was built to surface those gaps rather than collapse them into a clean number that papers over what the sources are actually saying.
Everything downstream follows from that premise.
The tool is free. That isn’t an incidental feature. It’s consequential to the integrity of what the tool does.
A monetized analytics tool has a structural conflict it can’t fully escape. Premium tiers create pressure to withhold the most useful features. Advertising creates implicit incentives to keep users engaged rather than informed. Affiliate relationships can subtly tilt recommendations in directions that are commercially convenient. None of this requires conscious intent. It’s embedded in the incentive structure. The tool avoids these pressures by having no revenue model. The analysis is accountable only to whether it’s right. That’s the only constraint worth having.
ESPN and Yahoo use different data structures, different player ID systems, different roster representations, and different scoring category schemas. A tool that works across both can either maintain two parallel analytical codebases or define a common schema and write adapters into it.
The adapter approach is more expensive to build and dramatically easier to maintain. Each platform’s raw API response flows through its adapter and emerges as a normalized LeagueData object with a consistent shape: league settings, team rosters, schedule, draft history, scoring categories, and roster slot counts. The analytics layer never touches the raw platform response. It only reads the normalized output.
This separates the data translation problem from the analysis problem. When Yahoo changes its API response format, one function changes. When a new platform is added, a new adapter is written and the entire pipeline inherits it without modification. The normalization logic is also auditable in isolation, which matters for verifying that the tool is reading the data correctly.
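The adapter idea can be sketched in miniature. The field names below (`teams`, `scoringCategories`, `rosterSlots`, and the raw payload shapes) are illustrative assumptions, not the tool's actual schema; the point is that two differently-shaped raw responses emerge with one normalized shape.

```javascript
// Hypothetical adapter sketch: each platform's raw response is translated
// into the same normalized shape. Field names are illustrative only.
function normalizeEspn(raw) {
  return {
    platform: "espn",
    leagueId: String(raw.id),
    teams: (raw.teams || []).map(t => ({
      id: String(t.id),
      name: t.name,
      roster: (t.roster?.entries || []).map(e => String(e.playerId)),
    })),
    scoringCategories: (raw.settings?.scoringItems || []).map(s => s.statName),
    rosterSlots: raw.settings?.rosterSlots || {},
  };
}

function normalizeYahoo(raw) {
  return {
    platform: "yahoo",
    leagueId: String(raw.league_key),
    teams: (raw.teams || []).map(t => ({
      id: String(t.team_key),
      name: t.name,
      roster: (t.players || []).map(p => String(p.player_id)),
    })),
    scoringCategories: (raw.stat_categories || []).map(s => s.display_name),
    rosterSlots: raw.roster_positions || {},
  };
}
```

Everything downstream of these two functions is platform-agnostic: the analytics layer reads the normalized object and never learns which adapter produced it.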
ESPN’s API presents specific extraction challenges. Roster settings are sometimes incomplete in preseason, returning category totals without individual slot counts or omitting positions with zero players assigned. The tool handles this through a two-pass fetch: initial settings from the primary endpoint, followed by a secondary fetch from the boxscore endpoint that often carries fuller data. When the secondary response contains better roster or scoring information, it triggers renormalization through the adapter. The normalized schema is always downstream of the freshest available data, and the re-normalization step is explicit and auditable rather than buried in conditional logic.
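The decision at the heart of the two-pass fetch — "does the secondary response carry fuller data?" — can be made explicit as a small pure check. The shapes and the specific completeness criteria here are assumptions for illustration; the real comparison may weigh more fields.

```javascript
// Illustrative completeness check for the two-pass ESPN fetch: the secondary
// (boxscore) response wins if it carries more roster slots or more scoring
// categories than the primary settings response. Shapes are hypothetical.
function isFuller(primary, secondary) {
  const slotCount = s => Object.keys(s?.rosterSlots || {}).length;
  const catCount = s => (s?.scoringCategories || []).length;
  return slotCount(secondary) > slotCount(primary) ||
         catCount(secondary) > catCount(primary);
}
```

Keeping the comparison in one named function is what makes the renormalization step "explicit and auditable rather than buried in conditional logic."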
Yahoo’s OAuth flow introduces a lifecycle problem the ESPN connection doesn’t have. Access tokens expire after one hour. The tool stores both the short-lived access token and the longer-lived refresh token in the browser’s local storage, monitors token age against a 30-day expiry window, and refreshes the access token silently before expiration using the refresh token. The connection appears persistent to the user because the token mechanics are handled entirely in the background.
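The token lifecycle reduces to a three-way decision that can be sketched as a pure function. The one-hour access lifetime and 30-day window come from the text; the five-minute early-refresh margin and the field names are assumptions.

```javascript
// Sketch of the silent-refresh decision. Token objects are assumed to carry
// issue timestamps; the refresh margin is an illustrative constant.
const ACCESS_TTL_MS = 60 * 60 * 1000;             // access tokens live one hour
const REFRESH_TTL_MS = 30 * 24 * 60 * 60 * 1000;  // 30-day refresh window
const REFRESH_MARGIN_MS = 5 * 60 * 1000;          // assumed: refresh 5 min early

function tokenAction(tokens, now) {
  if (!tokens) return "reauthorize";
  if (now - tokens.refreshIssuedAt > REFRESH_TTL_MS) return "reauthorize";
  if (now - tokens.accessIssuedAt > ACCESS_TTL_MS - REFRESH_MARGIN_MS) return "refresh";
  return "ok";
}
```

A background timer (or a check before each API call) runs this function; the user only ever sees the "reauthorize" branch, and only after a month of inactivity.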
Both adapters stamp a platform field on the normalized output. Every function downstream that needs to distinguish ESPN behavior from Yahoo behavior reads S.league.platform rather than maintaining its own platform detection. The source of truth is in one place.
A source is any input that assigns ranked positions or projected statistics to players. FantasyPros ECR is a source. A FanGraphs Steamer CSV is a source. The connected platform’s internal rankings are a source. NFBC ADP is a source. Each measures something slightly different.
ECR is aggregated expert opinion, a consensus of analysts who have thought carefully about how the season will go, adjusted for positional context and draft format. Projection systems like Steamer are statistical models built on historical performance, aging curves, park factors, and regression to mean. They know nothing about narrative or reputation. ADP is a market price, what actual participants in real drafts have been paying for players. It carries behavioral patterns, recency bias, league-size effects, and the accumulated heuristics of the fantasy-playing population. These three things diverge for structural reasons. Experts may underweight regression toward the mean on breakout candidates. Projection models don’t capture role uncertainty or injury history that isn’t yet in the stats. ADP reflects what the market believes, which is often a lagged version of what the analysis shows.
The composite pipeline takes all active sources, weights them according to their sourceType and user configuration, and builds a composite ranking. But the visible output isn’t only the composite. Every player’s record carries each source’s rank alongside the composite rank so the user can see where the sources agree and where they don’t. A player with tightly clustered source ranks is well-priced by consensus. A player with a 40-position spread between sources is a live question. The spread is the story, and hiding it inside an averaged number defeats the purpose.
Source weighting is configurable. ECR defaults to higher weight because expert consensus tends to outperform individual models at the tail end of rankings. A user who has strong views about a specific projection system can upweight it and the composite responds immediately. Exposing the weighting as a choice requires admitting that the output isn’t objective. The tool doesn’t pretend otherwise.
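A minimal version of the weighted composite, with the per-source spread preserved alongside the averaged rank, might look like the following. The data shape (`{ weight, ranks }`) is an assumption; the essential move is that the output keeps the disagreement visible rather than discarding it.

```javascript
// Weighted-composite sketch: each source maps playerId -> rank, with a
// user-configurable weight. The spread field keeps the disagreement visible.
function compositeRanks(sources) {
  const scores = {};
  for (const { weight, ranks } of sources) {
    for (const [id, rank] of Object.entries(ranks)) {
      const s = scores[id] || { sum: 0, w: 0, perSource: [] };
      s.sum += weight * rank;
      s.w += weight;
      s.perSource.push(rank);
      scores[id] = s;
    }
  }
  return Object.entries(scores)
    .map(([id, s]) => ({
      id,
      composite: s.sum / s.w,
      spread: Math.max(...s.perSource) - Math.min(...s.perSource),
    }))
    .sort((a, b) => a.composite - b.composite);
}
```

Upweighting one source shifts the composite immediately, but the `spread` field is weight-independent: a 40-position disagreement stays a 40-position disagreement no matter how the user tunes the average.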
The boot sequence is deliberately tiered. On initial load, the tool fetches FantasyPros ECR before anything else, builds a composite from that single source, and renders a working ranked list. The user has a functional tool within seconds. The platform connection, the extended player pool fetch, and supplemental data requests all happen asynchronously behind a UI that’s already interactive. The slow work happens behind a fast surface.
Demo mode exists because the MLB offseason is five months long and the APIs are live year-round. Building a tool that connects to ESPN and Yahoo means working with data structures that are only fully populated during the season: live matchups, weekly boxscores, scoring period stats, transaction feeds, roster moves. Without a way to generate realistic league state on demand, development stalls from November to March. The demo was built as a data modeling environment first and a user-facing feature second.
A 10-team snake draft is simulated from the current player pool, producing realistic rosters with keeper assignments and a full draft log. Mid-season mode generates eleven weeks of matchup results with category wins derived from actual roster VORP, realistic player stat lines, simulated transactions, and live MLB game states. The demo uses the same normalization pipeline as a live league connection. Every adapter, every analytics pass, every renderer receives data in the same shape it would from ESPN or Yahoo. When a feature works in demo mode, it works in production. When it breaks in demo mode, the bug is real. The demo isn’t a simplified preview of the tool. It’s the tool running against synthetic data that exercises the same code paths as live data.
Z-scores normalize player projections across scoring categories by measuring each player’s projected contribution as a deviation from the position-eligible player mean, in units of standard deviation. The normalization is necessary because raw counting stats aren’t comparable across categories. A projection of 30 home runs and a projection of 40 stolen bases can’t be directly summed into a value number without first converting them to a common scale.
The critical implementation detail is that z-scores are calibrated to your league’s draftable pool, not to a global population. The mean and standard deviation for each category are calculated from the set of players who would realistically be drafted in a league of your size and format. Standard z-score implementations often use fixed sample sizes or global pools that don’t reflect your actual draft context. Recalculating from the actual draftable pool means that positional depth, team count, and roster construction all influence the underlying distribution. A 12-team league and an 8-team league playing identical categories have different z-score distributions because the relevant player pool is different.
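The calculation itself is standard; the detail that matters is which values go into it. A sketch, where the caller is responsible for passing only the draftable pool's projections for one category:

```javascript
// Z-score sketch: the values array is assumed to already be restricted to
// the league's draftable pool for a single category, so the mean and
// standard deviation reflect the actual draft context.
function zScores(values) {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance = values.reduce((a, b) => a + (b - mean) ** 2, 0) / values.length;
  const sd = Math.sqrt(variance) || 1; // guard against a zero-variance category
  return values.map(v => (v - mean) / sd);
}
```

Running this over a 12-team pool and an 8-team pool produces different scores for the same projected 30 home runs, because truncating the pool changes both the mean and the spread.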
VORP extends z-score analysis by establishing a replacement-level baseline for each position. The replacement player is the best positionally eligible player who would go undrafted in a league of your size, the player freely available on waivers after the draft ends. Every player’s value is then measured as surplus above that floor rather than above zero or above the mean.
The positional scarcity effect becomes visible here. A catcher with average offensive production may carry a VORP that exceeds an above-average outfielder’s, because the replacement level at catcher is significantly lower than at outfield. The scarcity is real. The depth difference is real. Rankings that ignore positional context will systematically undervalue scarce positions and overvalue deep ones, and the error compounds through the draft as positional runs develop. VORP handles this correctly by measuring each player against the actual alternative available at their position.
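In code, the baseline is just an index into the position's sorted value list. A sketch, where `replacementIndex` stands in for "number of players at this position who get drafted in a league of your size" (how that index is derived is the real modeling work, not shown here):

```javascript
// VORP sketch: value each player at a position as surplus above the
// replacement level. zSum is the player's summed category z-scores;
// replacementIndex is an assumed input, derived from league size elsewhere.
function vorp(playersAtPosition, replacementIndex) {
  const sorted = [...playersAtPosition].sort((a, b) => b.zSum - a.zSum);
  const replacement = sorted[Math.min(replacementIndex, sorted.length - 1)].zSum;
  return sorted.map(p => ({ ...p, vorp: p.zSum - replacement }));
}
```

The scarcity effect falls out automatically: a shallow position has a low `replacement` value, so everyone above it gains surplus relative to players at deep positions.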
Tier clustering groups players into bands where the projected gap within a tier is smaller than the uncertainty in the projections themselves. The practical implication: worrying about player 12 versus player 15 within the same tier is a category error, because the projections that produced those ranks don’t support that level of precision. The tier boundary is where discrimination actually becomes meaningful. Two players in adjacent tiers are genuinely different by what the data can see. Two players within the same tier are not, and treating them as different produces decisions whose confidence exceeds their evidentiary basis.
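One simple way to form such bands is a gap-based pass over the sorted values: a new tier starts wherever the drop between consecutive players exceeds a threshold. The fixed threshold below is an assumption for illustration; a real criterion would tie the threshold to the projections' own uncertainty.

```javascript
// Tier-clustering sketch: sortedValues is descending; a new tier begins
// wherever the gap to the previous player exceeds gapThreshold. The fixed
// threshold is illustrative; it could instead come from projection variance.
function assignTiers(sortedValues, gapThreshold) {
  let tier = 1;
  return sortedValues.map((v, i) => {
    if (i > 0 && sortedValues[i - 1] - v > gapThreshold) tier += 1;
    return tier;
  });
}
```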
Empirical Bayes shrinkage addresses the projection reliability problem for players with limited track records. A 23-year-old with 200 career plate appearances might project for a .340 batting average based on a small sample that happened to go well. The naive projection carries far more uncertainty than its precision implies. Shrinkage pulls extreme projections toward the population mean, with the magnitude of pull inversely proportional to sample size. Players with extensive major-league history are pulled only slightly. Players with limited history are pulled substantially. The resulting projections are less impressive but more defensible, and they reduce the drafting errors that come from treating small-sample performance as established quality.
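For a rate stat, the shrinkage has a compact closed form: blend the observed rate with the population rate, weighting by sample size against a prior pseudo-sample-size. The prior strength of 300 below is an assumed illustration, not the tool's calibrated value.

```javascript
// Empirical Bayes shrinkage sketch for a rate stat such as batting average.
// priorStrength is the pseudo-sample-size of the population prior; 300 is
// an assumption for illustration, not a calibrated constant.
function shrink(observedRate, sampleSize, populationRate, priorStrength = 300) {
  return (observedRate * sampleSize + populationRate * priorStrength) /
         (sampleSize + priorStrength);
}
```

The .340 hitter with 200 plate appearances lands near .286 under this sketch, while the same .340 over 3,000 plate appearances barely moves: exactly the "pulled substantially" versus "pulled only slightly" behavior described above.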
The VORP calculation uses a greedy positional assignment algorithm to handle players eligible at multiple positions. Rather than assigning each player to their primary position, the algorithm maximizes total VORP across the roster by considering the full scarcity landscape. A player eligible at both second base and shortstop is assigned to whichever position maximizes the team’s collective surplus above replacement. This produces better draft recommendations than naive primary-position assignment, particularly in leagues with unusual positional depth distributions.
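A simplified greedy pass might consider every (player, position) pair in descending value order and fill slots first-come: this is an illustrative sketch of the approach, not the tool's actual algorithm, and like any greedy heuristic it approximates rather than guarantees the optimum.

```javascript
// Greedy multi-eligibility sketch: rank all (player, position) pairs by
// positional VORP, then assign each player once and each slot until full.
// Data shapes and the specific greedy order are illustrative assumptions.
function assignPositions(players, slots) {
  // players: [{ id, eligible: ["2B","SS"], vorpByPos: { "2B": 3.1, SS: 4.0 } }]
  // slots:   { "2B": 1, SS: 1 }
  const open = { ...slots };
  const assignments = {};
  const pairs = players.flatMap(p =>
    p.eligible.map(pos => ({ id: p.id, pos, v: p.vorpByPos[pos] })));
  pairs.sort((a, b) => b.v - a.v); // best surplus first
  for (const { id, pos } of pairs) {
    if (assignments[id] !== undefined || !open[pos]) continue;
    assignments[id] = pos;
    open[pos] -= 1;
  }
  return assignments;
}
```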
Batting average depends on two things: contact quality and whether that contact found fielders. The second variable is close to random. Balls in play drop for hits or find gloves based on defensive positioning, spray direction, park geometry, and outcomes that aren’t repeatable in any meaningful sense. Over a season, the noise largely cancels. Over a week, or a month, it doesn’t.
Statcast tracks exit velocity and launch angle on every batted ball. From those inputs, expected batting average is calculated as the historical hit rate on balls with similar velocity and angle profiles, independent of where any specific ball actually landed. A line drive at 105 mph has an xBA around .700. A soft grounder at 75 mph has an xBA around .100. The player’s actual batting average reflects what happened. The expected batting average reflects what the contact quality predicts. When they diverge, the divergence has a direction and a correction mechanism built into it.
A player hitting .198 on an xBA of .285 is having his hard contact land in gloves at an unsustainable rate. The divergence is temporary. The direction is upward. This is a buy signal that exists in publicly available data and is systematically invisible to people watching batting averages because batting averages don’t separate contact quality from luck. The xStats section automates the gap identification and ranks players by the magnitude and consistency of their expected-versus-actual divergence.
The regression score integrates multiple expected-versus-actual comparisons into a single directional signal: xBA against BA, xSLG against SLG, xwOBA against wOBA, alongside barrel rate and hard contact rate. A player whose contact quality metrics are consistently stronger than his surface stats across all dimensions has a positive regression score. The signal isn’t claiming the player is a good player or that his contact quality will remain stable. It’s claiming that his current surface stats are an unreliable representation of his recent batted-ball quality, and that this unreliability has a predictable correction direction.
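A minimal version of that integration averages the expected-minus-actual gaps across the stat pairs named above. The equal weighting, and the omission of barrel rate and hard contact rate, are simplifying assumptions for illustration.

```javascript
// Regression-score sketch: positive means surface stats are running below
// contact quality (a buy signal), negative means they are running above it.
// Equal weights are an assumption; the real score also folds in barrel rate
// and hard contact rate, which are omitted here.
function regressionScore(p) {
  const gaps = [
    p.xBA - p.BA,
    p.xSLG - p.SLG,
    p.xwOBA - p.wOBA,
  ];
  return gaps.reduce((a, b) => a + b, 0) / gaps.length;
}
```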
The limits are explicit in the output. Expected statistics model contact quality. They don’t model pitch selection, lineup construction, park factors that change across a split, or deliberate approach changes where the contact quality profile has shifted but the historical comparison period hasn’t yet caught up. A player with a high regression score may have recently restructured his swing, making his pre-change and post-change contact profiles genuinely different. The signal is a starting point for investigation, not a conclusion. The tool flags this rather than smoothing over it.
The insights system generates observation cards from live league data. It isn’t a recommendation engine. That distinction shapes everything about how it’s built.
A recommendation engine takes data and produces a directive: start this player, drop that one, make this trade. It resolves uncertainty on the user’s behalf by presenting conclusions as actionable. An observation engine surfaces what the data shows and leaves the inference to the user. The difference is in what each tool claims to know. A recommendation engine claims to know enough to tell you what to do. An observation engine claims only to see what’s in the data and to report it clearly.
Insight cards are confidence-tiered. High-confidence observations are drawn from hard data with minimal inferential distance: your current matchup’s W-L-T record by category, players whose surface stats diverge from expected stats by a statistically significant margin, rosters with clear positional imbalances. Medium-confidence observations involve one or two inferential steps and carry explicit uncertainty markers. Low-confidence observations are labeled as speculative with a specific basis described.
Every insight card that makes a directional claim also specifies what the claim can’t account for. The matchup category tracker shows current margins and flags categories close enough to potentially flip. But the tool is explicit that closeness in a category isn’t the same as flippability. Flippability depends on remaining games, bench depth, opponent lineup decisions, and category volatility across the scoring period. None of those factors are modeled. The insight says: this category has a small margin. It doesn’t say you can flip it by starting a specific pitcher tonight. That second inference requires information the tool doesn’t have, and the tool says so rather than supplying a confident answer from incomplete inputs.
This design makes the tool less immediately satisfying. Users who want to be told what to do will find it frustrating. The tradeoff is that when the tool does make a claim, the claim is honest about its basis. A tool that always has an answer is often wrong in quiet ways. A tool that knows its limits builds a different kind of trust.
The tool is vanilla JavaScript with no framework dependencies. The decision follows from the requirements.
React and Vue solve problems this codebase doesn’t have. They manage component state isolation across large applications with many collaborating developers. They solve the re-render coordination problem at scale. This application has one developer, a deliberately simple state model in the S object, and performance requirements that a virtual DOM reconciliation layer would work against. Every framework adds bundle weight and abstraction cost. The bundle delays initial parse time. The abstraction sits between the code and the DOM, adding overhead to every state change while the framework decides what needs to update. For an application that needs to build a composite ranking pipeline and render it to a table as fast as possible, those costs aren’t negligible and the problems being solved aren’t present.
The architecture replaces framework patterns with direct decisions. State is a single documented global, S, with typed fields and a generational counter that increments on meaningful state changes and invalidates all derived caches. Component isolation is tab-scoped rendering: each tab has its own render function that produces an HTML string and sets innerHTML once. One DOM write per render. No diffing. No reconciliation. User interaction is handled by a single event listener that reads data-action attributes and dispatches to named handler functions. One listener serves the entire application. New interactive elements can be added anywhere without listener registration.
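The single-listener dispatch can be sketched as a walk up the DOM tree from the clicked element to the nearest data-action attribute. The handler names below are illustrative; the routing shape is the point.

```javascript
// Event-delegation sketch: one lookup table of named handlers, one dispatch
// function. Handler names are illustrative, not the tool's actual actions.
const handlers = {
  sortByVorp: () => "sorted",
  toggleTier: () => "toggled",
};

function dispatch(target) {
  // walk upward from the event target to the nearest element carrying
  // a data-action attribute, then route to its named handler
  for (let el = target; el; el = el.parentElement) {
    const action = el.dataset && el.dataset.action;
    if (action && handlers[action]) return handlers[action]();
  }
  return null;
}

// In the browser this is wired exactly once:
// document.addEventListener("click", e => dispatch(e.target));
```

Because the listener reads attributes at click time, a renderer can write `<button data-action="sortByVorp">` into any innerHTML string and the button works immediately, with no registration step.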
The analytics pipeline runs once per state change and produces a sorted composite array that every renderer treats as immutable. Z-scores, VORP, tier clustering, disagreement metrics, and empirical Bayes shrinkage run in a single pass over the player pool. The result is stable until the next data change triggers a full recalculation. Renderers read the composite array and produce output. They don’t run analysis. The analytics pipeline doesn’t touch the DOM. The separation keeps both sides simple and testable in isolation.
Caching is layered at multiple levels. In-memory caches handle expensive derived calculations within a session: scarcity tables, roster composition, VORP baselines. Browser localStorage handles persistence across sessions, keyed by league ID, season, and data generation counter to prevent stale reads. Edge caching on the server layer handles shared data: consensus rankings, MLB schedule data, and ADP sources don’t vary per user and don’t need to be fetched individually. Each cache TTL is tuned to the update frequency of the data it holds. Completed-week boxscores are cached permanently. Current-week data is cached for five minutes. Live MLB scoreboard data expires in sixty seconds. The effect is that most interactions hit local memory rather than the network, and the network requests that do fire are often served from a cache rather than a cold origin call.
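The TTL tiers named above reduce to a lookup table plus a freshness check. The constants come straight from the text; the table structure and key names are illustrative.

```javascript
// TTL-by-data-kind sketch mirroring the tiers described above. Constants
// come from the text; key names and the lookup shape are illustrative.
const TTL_MS = {
  completedWeekBoxscore: Infinity,  // completed weeks never change
  currentWeekData: 5 * 60 * 1000,   // five minutes
  liveScoreboard: 60 * 1000,        // sixty seconds
};

function isFresh(kind, storedAt, now) {
  // unknown kinds get a TTL of zero: always refetch
  return now - storedAt < (TTL_MS[kind] ?? 0);
}
```

The same check works at every layer: an in-memory entry, a localStorage entry keyed by league and generation counter, or an edge-cache header, each consulting the TTL appropriate to its data kind.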
A thin server layer handles API proxying and security headers. ESPN and Yahoo APIs require server-side requests. The proxy layer is stateless, adds consistent security headers, and returns responses with appropriate cache controls. It has no warm state to maintain and no session context to manage.
There is no build step.
No transpilation. No bundling. No minification pass. No webpack, no Vite, no esbuild. The JavaScript that ships to the browser is the JavaScript that was written, character for character. The browser’s parser receives it directly.
This is not a philosophical stance. It’s a performance decision with measurable consequences. A build step exists to solve problems that this codebase doesn’t have: code splitting across lazy-loaded routes, JSX compilation, TypeScript erasure, polyfill injection for older browsers. Each of those transformations adds indirection. The bundler rewrites import paths. The minifier renames variables. The source map generator builds a parallel representation so you can debug the code you wrote instead of the code that shipped. The result is a toolchain that sits between you and the thing you built, adding latency at deploy time and opacity at debug time.
Without a build step, there is nothing to configure, nothing to cache-invalidate, nothing that can silently transform your code into something that behaves differently than what you tested. When something breaks, the stack trace points at the actual line in the actual file. When something is slow, the profiler shows the actual function. There is no gap between the code and the truth about the code.
The analytics pipeline computes z-scores, VORP, tier clustering, empirical Bayes shrinkage, and positional scarcity for 500+ players in a single synchronous pass. Under 50 milliseconds. That pass runs once per state change. Every renderer downstream reads the result without recomputation.
First contentful paint happens before the network requests for league data have returned, because the initial render path (HTML parse, CSS apply, font load, first table render from cached composite data) has no framework initialization overhead competing for the main thread. Tab switches are instant: each tab’s content is generated as an HTML string and written to the DOM in a single innerHTML assignment. No virtual DOM diff. No reconciliation. One write, one paint.
Network requests are structured so that the user never waits for data they don’t need yet. The boot sequence loads consensus rankings first, renders a usable table, then fetches league data, then loads supplemental projections, then populates the free agent pool. Each phase renders its results as they arrive rather than blocking on the slowest request. The matchup scoreboard lazy-loads player stats only when the user opens a specific matchup. Statcast data loads only when the user explicitly requests it. The tool never fetches data speculatively. Every request is triggered by a user action or a visible render path.
All visual transitions and animations are CSS-only and compositor-thread-accelerated, decoupled entirely from JavaScript execution.
Expected statistics don’t account for deliberate approach changes. A hitter who restructures his swing to generate more lift will produce exit velocity and launch angle profiles that differ from his historical baseline. If the contact quality improvement is real but the player’s expected statistics are being compared against a historical mean that predates the change, the regression signal will flag favorable divergence where there is simply a new baseline. The signal isn’t wrong, exactly; the contact quality is what it is. But the direction of causation isn’t visible to the model.
Cross-platform projection sourcing isn’t implemented. A connected Yahoo league uses Yahoo’s rankings as the platform source. A connected ESPN league uses ESPN’s. There’s no mechanism to load ESPN projections into a Yahoo league or vice versa. FanGraphs imports serve as the primary analytical supplement and cover most of the use case, but the connected platform’s rankings are always the platform source.
The insights engine doesn’t model schedule remaining. It sees what has happened in the current scoring period. It doesn’t project what will happen based on remaining game counts, opponent strength, weather, or lineup decisions. This is a tractable problem. The current judgment is that a clearly-labeled, honest partial picture is more useful than a confidently-stated incomplete one.
The platform APIs that power the connection layer can change without notice. A working feature today can break tomorrow through no fault of the tool. This is the permanent condition of building on top of third-party platforms. The alternative is to not build the tool. That tradeoff is accepted.
That’s what this is. A ranked list that doesn’t pretend to be more certain than its inputs, built on APIs it doesn’t control, distributed for free to whoever finds it useful. The architecture follows from the analysis. The analysis follows from the premise. The premise is that uncertainty, shown clearly, is more useful than certainty, manufactured.
The list was always fiction. This is an attempt at something more honest.