A modular, data-driven framework for financial analysis and intrinsic valuation
Overview
Over the past few months, I designed and implemented a modular Equity Valuation Engine in Python. The goal was straightforward but ambitious:
Build a clean, extensible system that extracts financial data from public sources, structures it into a coherent domain model, computes derived financial metrics, and runs multiple valuation methodologies on top of the same dataset.
This project combines financial modeling, software architecture, data normalization, and applied quantitative analysis. It is meant to be readable by finance professionals while still following sound engineering principles.
High-Level Goals
The system was built around four core objectives:
- Standardize financial data into a strongly structured domain model.
- Decouple data sources from valuation logic using a repository pattern.
- Support multiple valuation methodologies for cross-model comparison.
- Enable reproducibility and extensibility through modular design.
Instead of treating valuation as a spreadsheet exercise, the project formalizes it as a composable software system.
System Architecture
The codebase follows a layered, domain-oriented architecture:
    application/
        metrics_loader/
        valuations/
    domain/
        metrics/
        valuation/
    infrastructure/
        repositories/
        mappers/
    calculations/

Layer Responsibilities
Infrastructure Layer
- Handles data access (currently via yfinance).
- Responsible for raw extraction and normalization.

Domain Layer
- Contains typed financial models (StockMetrics, Financials, BalanceSheet, etc.).
- Houses business logic and ratio calculations.

Application Layer
- Orchestrates workflows (loading metrics, running valuation engines).

Calculations Module
- Implements financial formulas (WACC, ROIC, DCF math, etc.).

This separation ensures:
- Clean abstraction boundaries
- Easier testing
- Future extensibility (e.g., replacing Yahoo Finance with EDGAR or a paid API)
Core Domain Model: StockMetrics
At the center of the system is a strongly typed dataclass:
StockMetrics
This object aggregates all financial and market data required for valuation.
It contains:
- CompanyProfile
- Financials
- CashFlow
- BalanceSheet
- MarketData
- Valuation
- HistoricalData
- Ratios

Each subcomponent is a dedicated dataclass. After initialization, derived metrics are computed automatically (via __post_init__).
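To illustrate the mechanism, here is a minimal sketch of a dataclass that fills in derived ratios inside `__post_init__`. Only the class names `StockMetrics`, `Financials`, and `Ratios` come from the project; every field name below is an assumption made for the example, and the real classes carry far more data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Financials:
    # Hypothetical field names; the project's real model is larger.
    revenue: float
    gross_profit: float
    operating_income: float
    net_income: float


@dataclass
class Ratios:
    # Derived values start empty and are filled in by StockMetrics.__post_init__.
    gross_margin: Optional[float] = None
    operating_margin: Optional[float] = None
    net_margin: Optional[float] = None


@dataclass
class StockMetrics:
    ticker: str
    financials: Financials
    ratios: Ratios = field(default_factory=Ratios)

    def __post_init__(self) -> None:
        revenue = self.financials.revenue
        if revenue:  # leave ratios as None when revenue is zero
            self.ratios.gross_margin = self.financials.gross_profit / revenue
            self.ratios.operating_margin = self.financials.operating_income / revenue
            self.ratios.net_margin = self.financials.net_income / revenue


metrics = StockMetrics("XYZ", Financials(200.0, 80.0, 50.0, 30.0))
```

Because the ratios are computed as soon as the object is constructed, downstream code never has to remember to call a separate "compute" step.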
Example of Derived Computations
From primitive financial inputs, the system computes:
- Revenue growth rate
- Gross / Operating / Net margins
- ROE, ROIC, ROA
- Free Cash Flow (FCF)
- Cost of debt
- Effective tax rate
- WACC inputs
This design ensures:
- Raw data is extracted once.
- Business logic is centralized.
- Valuation engines operate on normalized inputs.
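Several of the derived metrics listed above have common textbook definitions, sketched below. The function names and signatures are hypothetical, and the project's own formulas may differ in detail (for example, in how invested capital is defined).

```python
from typing import Optional


def safe_div(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def free_cash_flow(operating_cash_flow: float, capital_expenditures: float) -> float:
    # Capex is passed as a positive outflow here; some sources report it as negative.
    return operating_cash_flow - capital_expenditures


def effective_tax_rate(tax_expense: float, pretax_income: float) -> Optional[float]:
    return safe_div(tax_expense, pretax_income)


def cost_of_debt(interest_expense: float, total_debt: float) -> Optional[float]:
    return safe_div(interest_expense, total_debt)


def roic(operating_income: float, tax_rate: float, invested_capital: float) -> Optional[float]:
    # NOPAT: operating income after tax.
    nopat = operating_income * (1 - tax_rate)
    return safe_div(nopat, invested_capital)
```

Returning `None` instead of raising on a zero denominator keeps missing or degenerate inputs visible to later validation steps.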
Data Extraction: How yfinance Feeds the System
The project uses the yfinance library as a data provider. However, it does not tightly couple the system to Yahoo’s raw structure.
Repository Pattern
YfinanceDataLoader implements a FinancialRepository interface.
Its responsibilities:
Fetch:
- Income statement
- Balance sheet
- Cash flow statement
- Earnings history
- Market data
- Price history
Normalize:
- Inconsistent labels
- Quarterly vs annual formats
- Missing values
- Currency differences
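The repository idea can be sketched as follows. `FinancialRepository` is the interface name used in the project, but the method `load_raw_metrics`, the `InMemoryRepository` class, and the dictionary-based return type are assumptions made for this example. No yfinance calls are shown, so the real loader's internals are not implied here.

```python
from abc import ABC, abstractmethod
from typing import Dict


class FinancialRepository(ABC):
    """Interface that valuation code depends on, independent of the data source."""

    @abstractmethod
    def load_raw_metrics(self, ticker: str) -> Dict[str, float]:
        """Return normalized figures for one ticker (hypothetical method)."""


class InMemoryRepository(FinancialRepository):
    """Stand-in provider for tests; a yfinance-backed loader would expose the same method."""

    def __init__(self, data: Dict[str, Dict[str, float]]) -> None:
        self._data = data

    def load_raw_metrics(self, ticker: str) -> Dict[str, float]:
        return dict(self._data[ticker])


def net_margin(repo: FinancialRepository, ticker: str) -> float:
    # Depends only on the interface, so the provider can be swapped freely.
    raw = repo.load_raw_metrics(ticker)
    return raw["net_income"] / raw["revenue"]


repo = InMemoryRepository({"XYZ": {"revenue": 200.0, "net_income": 30.0}})
```

Swapping Yahoo Finance for another provider then means writing one new subclass, with no changes to functions such as `net_margin`.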
Label Normalization
Yahoo Finance labels vary by ticker and geography.
To mitigate fragility, the system:
- Normalizes labels (lowercase, underscore formatting)
- Uses candidate label lists for matching
- Searches DataFrames programmatically
This makes the extraction logic more robust than direct column indexing.
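A minimal sketch of candidate-based label matching, assuming labels arrive as plain strings (for instance, the index of a statement DataFrame). The helper names and the candidate list are hypothetical, not the project's actual mapper.

```python
import re
from typing import Iterable, Optional, Sequence


def normalize_label(label: str) -> str:
    """Lowercase a label and collapse non-alphanumeric runs into single underscores."""
    return re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")


def find_label(labels: Iterable[str], candidates: Sequence[str]) -> Optional[str]:
    """Return the original label matching the first candidate that is present, else None."""
    by_normalized = {normalize_label(label): label for label in labels}
    for candidate in candidates:
        match = by_normalized.get(normalize_label(candidate))
        if match is not None:
            return match
    return None


# Hypothetical candidate list, ordered by preference.
REVENUE_CANDIDATES = ["Total Revenue", "Revenue", "Operating Revenue"]
```

Because `find_label` returns the original label rather than the normalized one, the caller can use the result directly to index the source table.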
TTM (Trailing Twelve Months) Assembly
When quarterly data exists:
- The loader extracts the last four quarters
- Sums them to compute TTM values
- Falls back to annual values if needed
This supports realistic DCF inputs based on current performance rather than outdated annual figures.
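The TTM steps above can be sketched as a small function. It assumes the quarterly values are ordered newest first and that a missing quarter is represented as `None`; both conventions, and the function name, are assumptions for this example.

```python
from typing import Optional, Sequence


def ttm_value(
    quarterly: Sequence[Optional[float]],
    annual: Optional[float] = None,
) -> Optional[float]:
    """Sum the four most recent quarters; fall back to the annual figure if incomplete.

    `quarterly` is assumed to be ordered newest first.
    """
    latest_four = list(quarterly[:4])
    if len(latest_four) == 4 and all(value is not None for value in latest_four):
        return sum(latest_four)
    return annual
```

Summation only makes sense for flow items such as revenue or free cash flow. Balance sheet items are point-in-time values, so the latest quarter would be used for those instead.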
Currency Handling
The repository includes a conversion step to normalize financial data to USD when required. This avoids valuation distortion due to currency mismatch between financial statements and market pricing.
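As a rough sketch of such a conversion step, the function below scales monetary statement values by an exchange rate. The function name, the rate convention (USD per one unit of the reporting currency), and the dictionary layout are all assumptions, and where the project obtains its rates is not shown here.

```python
from typing import Dict


def to_usd(
    values: Dict[str, float],
    currency: str,
    usd_per_unit: Dict[str, float],
) -> Dict[str, float]:
    """Convert statement values into USD using a rate quoted as USD per unit of `currency`."""
    if currency == "USD":
        return dict(values)
    rate = usd_per_unit[currency]  # raises KeyError for an unknown currency
    return {name: amount * rate for name, amount in values.items()}
```

Only monetary amounts should pass through a step like this. Ratios and percentages are unit-free and would be distorted by conversion.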
Valuation Models Implemented
The system supports three independent valuation approaches.
1. Discounted Cash Flow (DCF)
Measures: Intrinsic value based on projected free cash flows.
Core Logic
- Project FCF forward using growth assumptions
- Compute terminal value using perpetual growth
- Discount cash flows using WACC
- Divide by shares outstanding
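The four steps above map onto a short function. This is a simplified sketch: it assumes a single constant growth rate over a fixed projection horizon and follows the listed steps literally, so it applies no net-debt adjustment. Parameter names and the five-year default are assumptions.

```python
def dcf_value_per_share(
    fcf: float,
    growth_rate: float,
    wacc: float,
    terminal_growth: float,
    shares_outstanding: float,
    years: int = 5,
) -> float:
    """Value per share from projected FCF plus a perpetual-growth terminal value."""
    if wacc <= terminal_growth:
        raise ValueError("WACC must exceed terminal growth for a finite terminal value")

    present_value = 0.0
    projected = fcf
    for year in range(1, years + 1):
        projected *= 1 + growth_rate  # project FCF forward one year
        present_value += projected / (1 + wacc) ** year

    # Perpetual-growth terminal value at the end of the horizon, discounted to today.
    terminal_value = projected * (1 + terminal_growth) / (wacc - terminal_growth)
    present_value += terminal_value / (1 + wacc) ** years

    return present_value / shares_outstanding
```

A useful sanity check: with zero growth in both stages, the result collapses to a plain perpetuity, `fcf / wacc / shares_outstanding`.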
Financial Inputs Used
- FCF (TTM)
- Revenue growth
- Operating margins
- Cost of debt
- Tax rate
- Capital structure
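Cost of debt, tax rate, and capital structure combine in the standard WACC formula, sketched below. Cost of equity is not in the list above; it is treated here as an externally supplied input (for example from CAPM), which is an assumption about how the project obtains it.

```python
def wacc(
    equity_value: float,
    debt_value: float,
    cost_of_equity: float,
    cost_of_debt: float,
    tax_rate: float,
) -> float:
    """Weighted average cost of capital with a tax shield on debt."""
    total_capital = equity_value + debt_value
    if total_capital <= 0:
        raise ValueError("total capital must be positive")
    equity_weight = equity_value / total_capital
    debt_weight = debt_value / total_capital
    # Interest is tax-deductible, so debt is weighted at its after-tax cost.
    return equity_weight * cost_of_equity + debt_weight * cost_of_debt * (1 - tax_rate)
```

For example, 80% equity at a 10% cost and 20% debt at a 5% pre-tax cost with a 20% tax rate gives 0.8 × 0.10 + 0.2 × 0.05 × 0.8 = 8.8%.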
Characteristics
- Most theoretically grounded
- Highly sensitive to WACC and terminal growth
- Suitable for stable, cash-generating businesses
2. P/E Multiple Projection Model
Measures: Future earnings value based on EPS growth and exit multiple.
Core Logic
- Project EPS growth
- Apply target P/E multiple
- Discount back to present value
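These three steps can be written in a few lines. The sketch assumes a constant EPS growth rate and a single exit at the end of the horizon; the parameter names and the five-year default are hypothetical.

```python
def pe_projection_value(
    eps: float,
    eps_growth: float,
    target_pe: float,
    discount_rate: float,
    years: int = 5,
) -> float:
    """Present value of a future share price implied by projected EPS and an exit P/E."""
    future_eps = eps * (1 + eps_growth) ** years
    future_price = future_eps * target_pe
    return future_price / (1 + discount_rate) ** years
```

The result scales linearly with `target_pe`, which is why the multiple assumption dominates the outcome of this model.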
Characteristics
- Market-relative valuation
- Simpler than DCF
- Multiple assumption dominates outcome
3. ROE-Based Equity Model
Measures: Shareholder value via equity growth and dividend flows.
Core Logic
- Grow equity per share using ROE
- Model payout/dividends
- Discount shareholder cash flows
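One way to sketch this logic is shown below. It assumes constant ROE and payout ratio, and it closes the projection by valuing the shares at ending book value. That terminal treatment is an assumption made for this example and is not described in the project; parameter names are hypothetical as well.

```python
def roe_model_value(
    book_value_per_share: float,
    roe: float,
    payout_ratio: float,
    discount_rate: float,
    years: int = 10,
) -> float:
    """Present value of dividends plus ending equity per share under a constant ROE."""
    equity = book_value_per_share
    present_value = 0.0
    for year in range(1, years + 1):
        earnings = equity * roe
        dividend = earnings * payout_ratio
        equity += earnings - dividend  # retained earnings compound equity
        present_value += dividend / (1 + discount_rate) ** year

    # Assumption: terminal value equals ending equity per share (exit at book value).
    present_value += equity / (1 + discount_rate) ** years
    return present_value
```

When ROE equals the discount rate, the value equals starting book value regardless of payout, so any value above book in this sketch comes from ROE exceeding the required return.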
Characteristics
- Focuses on capital efficiency
- Suitable for high-ROE compounders
- Sensitive to reinvestment assumptions
Differences Between Models
| Model | Anchored On | Sensitive To | Best For |
|---|---|---|---|
| DCF | Free Cash Flow | WACC, terminal growth | Mature, stable firms |
| P/E | Earnings | Exit multiple | Comparable-based valuation |
| ROE | Return on Equity | ROE persistence | Capital-efficient firms |
Running all three provides triangulation rather than a single-point estimate.
Engineering Decisions & Design Tradeoffs
Strengths
- Clean separation of data, domain, and valuation logic
- Typed domain model improves reliability
- Mapper-based extraction reduces hard-coded logic
- Extensible repository pattern
- Clear valuation engine boundaries
Known Risks
- yfinance is not a guaranteed stable API
- Label matching is inherently fragile
- TTM assembly assumes reporting continuity
- Valuation sensitivity remains high (as expected in finance)
Why This Project Matters
This project demonstrates:
- Applied financial modeling beyond spreadsheets
- Practical use of object-oriented design
- Data normalization and transformation logic
- Handling real-world API inconsistency
- Clear separation of concerns in architecture
- Translating financial theory into maintainable code
It bridges finance and software engineering.
For a technical recruiter, this shows:
- Domain-driven design thinking
- Production-minded architecture
- Practical quantitative implementation
- Strong abstraction discipline
- Ability to operationalize financial theory
Future Improvements
Planned enhancements include:
- Improved validation and logging around missing data
- Integration with alternative data providers (e.g., EDGAR parsing)
- Sensitivity analysis engine
- Monte Carlo valuation scenarios
- Data quality scoring
- Automated regression tests against historical snapshots
Closing Thoughts
Valuation is fundamentally about structured reasoning under uncertainty.
This project formalizes that reasoning into a composable system:
- Extract → Normalize → Model → Value → Compare
Rather than treating finance as a spreadsheet exercise, it treats it as a software problem, with abstractions, interfaces, domain boundaries, and testable components.
It is both a financial modeling engine and an architectural exercise in turning messy real-world data into structured, decision-ready insights.
