Assessment Criteria
Measuring the Non-Negotiable Dimensions of Entrepreneurial Success
Supsindex entrepreneurial indices form a structured assessment system designed to evaluate the critical, non-negotiable dimensions of founder capability—the human factors that most strongly determine whether a startup succeeds or fails.
Beyond a “Personality Test”
Why Founder Assessment Matters
Why do startups fail, and what do we mean by the Founder Effect?
Startups rarely fail due to a lack of passion or ambition. They fail because of specific, often measurable blind spots—in judgment, market understanding, behavioral resilience, and decision-making under pressure. Extensive industry analysis consistently shows that over 60% of startup failures are attributable to people-related factors: co-founder conflict, flawed decisions under uncertainty, misalignment with the market or ecosystem, and an inability to adapt when conditions change. These failures occur not because the idea was impossible, but because the human system executing it broke down.
The cost of this blind spot is enormous. Between 2019 and 2024 alone, hundreds of billions of dollars in global venture capital were lost to failures that were, in hindsight, preventable. Yet despite this, most ecosystems still rely on intuition, interviews, and surface-level signals to evaluate founders—the very layer where risk is highest.
What the ecosystem lacks is not ambition or capital, but accurate measurement of the human side of entrepreneurship.
What Supsindex Means by “Founder Soft Power”
In the context of Supsindex, “Soft Power” refers to the non-technical, human capabilities of a founder that materially influence entrepreneurial outcomes:
- a founder’s cognitive understanding of entrepreneurship and market mechanics,
- their behavioral judgment and decision-making patterns under uncertainty and pressure, and
- their awareness of, and fit with, the ecosystem in which they operate.
These capabilities are complex, multidimensional, and dynamic. Existing assessment approaches treat them as fragmented traits—often isolated, difficult to compare, and poorly validated. As a result, decision-making defaults to intuition.
Supsindex exists to replace that intuition with structured, comparable, and scientifically grounded measurement—through a system of indices.
Supsindex Is an Assessment Engine — Not a Quiz
Supsindex is not a standard personality test, nor a lifestyle or “cosmopolitan” quiz designed for virality. It is a founder assessment engine built to measure the critical dimensions of entrepreneurial performance. At its current stage, Supsindex objectively measures three core dimensions that determine whether a founder can operate effectively under real entrepreneurial conditions.
In parallel, Supsindex is deliberately expanding toward a comprehensive framework for assessing entrepreneurial potential. As additional indices are deployed, further dimensions—such as team dynamics, decision-making under simulated pressure, and longitudinal founder growth—will be progressively unlocked, completing a full and evolving map of founder capability.
The Triangulated Assessment Engine (Current State)
As of today, the Supsindex Assessment Engine is triangulated, targeting three foundational pillars of founder soft power at the individual level:
Cognitive Power
Can you understand and structure the problem correctly?
Measures the accuracy, coherence, and practical usability of a founder’s entrepreneurial knowledge and mental models.
Behavioral Judgment
Will you make sound decisions under pressure and uncertainty?
Assesses behavioral patterns, resilience, ethical grounding, bias exposure, and decision quality in stressful or ambiguous conditions.
Ecosystem Fit
Do you understand the rules, norms, and constraints of your target ecosystem?
Evaluates a founder’s awareness of market structures, regulations, cultural dynamics, and ecosystem-specific realities.
These three dimensions are fully measurable today through Supsindex’s live indices!
Supsindex Methodology & Scoring
Supsindex moves beyond simplistic scoring models where every question is treated equally. Instead, the platform employs advanced psychometric and decision-science methods, including:
- Item Response Theory (IRT) for ability-weighted scoring
- Thurstonian modeling to recover latent decision preferences
These methods are widely used in high-stakes domains such as medical licensing, aviation, and military selection—fields where false positives carry unacceptable risk. Supsindex applies the same rigor to entrepreneurship, where the cost of misjudgment is measured in lost capital, wasted time, and unrealized human potential.
Supsindex does not ask only who you think you are. It measures how you are likely to perform when reality applies pressure.
Assessment Engine Pillar 1: FPA
Founder Public Awareness – The Cognitive Engine
Entrepreneurial Literacy
Signal Detection Ability
Cognitive Processing Power
The FPA operates as a “Dynamic Knowledge Engine”. Its question bank adapts based on a founder’s startup stage (e.g., Pre-Seed vs. Series A) and industry context (e.g., EdTech vs. FinTech). Each assessment deck combines:
- 50 general entrepreneurial and business questions
- 25 context-specific questions (industry + stage)
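As a rough illustration of how such a deck could be assembled, here is a minimal sketch in Python; the Question fields, tags, and sampling logic are illustrative assumptions, not the production implementation:

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Question:
    text: str
    industry: str | None   # None marks a general entrepreneurship item
    stage: str | None      # e.g., "pre-seed", "series-a"

def assemble_deck(bank: list[Question], industry: str, stage: str,
                  n_general: int = 50, n_context: int = 25) -> list[Question]:
    """Draw the 50 general items plus the 25 items matching the founder's
    industry and stage, as described above."""
    general = [q for q in bank if q.industry is None]
    context = [q for q in bank if q.industry == industry and q.stage == stage]
    return random.sample(general, n_general) + random.sample(context, n_context)
```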
FPA Measurement Methodology
Probabilistic rather than deterministic scoring, so results reflect true underlying ability
Traditional tests weight every question equally, a scientifically flawed approach that treats trivial definitions (e.g., “What is B2B?”) the same as complex strategic diagnoses that require synthesis across multiple variables.
Supsindex applies a 2-Parameter Logistic (2PL) Item Response Theory (IRT) model, ensuring that scores reflect true underlying ability rather than surface-level correctness.
Difficulty Parameter
Derived empirically from thousands of founder responses, this parameter validates question difficulty based on real performance data—not expert intuition. Founders are rewarded for solving problems that challenge the majority of peers, demonstrating depth rather than surface knowledge.
Discrimination Parameter
Measures how effectively a question separates high-performers from low-performers. Correct answers on highly discriminative items—those strongly correlated with venture success—contribute significantly more to the final score than generic items. This allows the algorithm to learn which questions actually matter.
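To make the mechanics concrete, here is a minimal sketch of 2PL scoring; the item parameters and the grid-search estimator are simplified stand-ins for the production calibration:

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL model: probability a founder with ability theta answers correctly.
    a = discrimination, b = difficulty."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_ability(responses: list[int], items: list[tuple]) -> float:
    """Maximum-likelihood ability estimate via a simple grid search.
    responses: 0/1 per item; items: (a, b) pairs per item."""
    grid = [x / 100 for x in range(-400, 401)]  # theta in [-4, 4]
    def log_lik(theta: float) -> float:
        ll = 0.0
        for r, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            ll += math.log(p if r else 1.0 - p)
        return ll
    return max(grid, key=log_lik)

# A hard, highly discriminative item moves the estimate far more than an
# easy, weakly discriminative one.
items = [(1.8, 1.2), (0.4, -1.0), (1.2, 0.3)]
print(estimate_ability([1, 1, 0], items))
```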
The “Distractor” Mechanism
A founder’s ability to filter signal from noise
To measure this executive function, the FPA intentionally includes Distractor questions—content that appears technical or important but is statistically irrelevant to startup success (e.g., trivia about ergonomic chair angles, font ligatures, or obsolete coding syntax).
Using Signal Detection Theory, we calculate D-Prime (d’) sensitivity:
- Hit Rate (Sensitivity): Correctly flagging irrelevant noise as irrelevant
- False Alarm Rate (1 − Specificity): Incorrectly dismissing a real strategic issue as noise
A high D-Prime score requires both vigilance and discernment. The model penalizes false alarms (ignoring real problems) as heavily as misses (falling for distractions), distinguishing focused founders from those who are inattentive, overly cynical, or guessing.
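A minimal sketch of the d′ computation, using the standard-normal inverse CDF from Python’s standard library; the clamping convention and the counts are illustrative:

```python
from statistics import NormalDist

def clamp(p: float) -> float:
    return min(max(p, 0.01), 0.99)  # avoid infinite z-scores at rates of 0 or 1

def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """Signal-detection sensitivity: d' = z(hit rate) - z(false-alarm rate).
    Following the framing above: a 'hit' is flagging a distractor as
    irrelevant; a 'false alarm' is dismissing a genuine strategic issue."""
    z = NormalDist().inv_cdf
    hit_rate = clamp(hits / (hits + misses))
    fa_rate = clamp(false_alarms / (false_alarms + correct_rejections))
    return z(hit_rate) - z(fa_rate)

# A vigilant, discerning founder: many hits, few false alarms.
print(d_prime(hits=9, misses=1, false_alarms=1, correct_rejections=9))  # ~2.56
```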
Assessment Engine Pillar 2: GEB
General Entrepreneur Behavior – The Behavioral Engine
Decision-Making Quality
Resilience Under Pressure
Cognitive Bias Susceptibility
The GEB operates as a “Behavioral Judgment Engine”. Instead of asking founders to describe themselves, it places them inside realistic entrepreneurial situations and evaluates how they choose to act under pressure, uncertainty, and constraint.
Scenarios are distributed across 15 core behavioral categories, including:
- Resilience
- Ethical Integrity
- Resource Management
- Adaptability
- Strategic Judgment
GEB Measurement Methodology
Using comparative judgment and probabilistic modeling rather than self-reporting or fixed-trait scoring
Traditional personality tests suffer from Ipsative Bias. If a founder rates themselves highly on every positive trait, the assessment loses discriminatory power. Forced-choice formats also introduce artificial negative correlations—for example, appearing low in Adaptability simply because Integrity was prioritized. GEB is designed explicitly to overcome these limitations.
Comparative Judgment Modeling
In every entrepreneurial crisis scenario, founders are asked to select:
- the Most Effective action
- the Least Effective action
This forces real trade-offs under pressure, mirroring actual founder decision-making.
The Problem (Ipsative Data): In conventional scoring, choosing Option A implies not choosing Option B, creating mathematical dependency. This can falsely suppress traits (e.g., appearing “low” in Resilience simply because Ethics was prioritized).
Supsindex applies Thurstonian Item Response Theory to recover each founder’s latent utility values for every behavioral trait. Decisions are modeled as comparisons between underlying utilities rather than fixed choices.
This allows the system to mathematically demonstrate that a founder can be both highly Adaptable and highly Principled, instead of forcing artificial trade-offs.
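A simplified sketch of the underlying idea: most/least picks are decomposed into pairwise preferences, which a Thurstone-style model relates to latent utilities. The trait labels and data are illustrative, and production estimation uses a full Thurstonian IRT model rather than this reduction:

```python
from statistics import NormalDist

# Illustrative most/least picks. Each option in a crisis scenario expresses a
# behavioral trait; the founder marks one "most effective", one "least effective".
blocks = [
    {"options": ["resilience", "ethics", "adaptability", "strategy"],
     "most": "ethics", "least": "strategy"},
    {"options": ["resilience", "ethics", "adaptability", "strategy"],
     "most": "adaptability", "least": "resilience"},
]

def pairwise_preferences(blocks: list[dict]) -> dict:
    """Decompose most/least picks into implied pairwise wins: 'most' beats
    every other option in the block, and every other option beats 'least'."""
    wins: dict[tuple[str, str], int] = {}
    for blk in blocks:
        for opt in blk["options"]:
            if opt != blk["most"]:
                wins[(blk["most"], opt)] = wins.get((blk["most"], opt), 0) + 1
            if opt not in (blk["most"], blk["least"]):
                wins[(opt, blk["least"])] = wins.get((opt, blk["least"]), 0) + 1
    return wins

def preference_prob(u_i: float, u_j: float) -> float:
    """Thurstone-style link: P(i preferred over j) = Phi(u_i - u_j), where the
    u values are latent trait utilities. Fitting the utilities to the observed
    wins recovers a profile in which two traits can both be high."""
    return NormalDist().cdf(u_i - u_j)

print(pairwise_preferences(blocks))
```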
A dual-layer behavioral profile, capturing both strengths and risks
The Strength Engine: The GEB scoring engine measures 19 positive behavioral constructs that drive entrepreneurial success, such as opportunity recognition, recombination, grit, and strategic persistence.
The Risk Engine (Bias Detection): Measures susceptibility to 20 cognitive derailers that frequently undermine early-stage companies.
Distractor Engineering: Incorrect options are intentionally engineered to be socially desirable. For example, a Micromanagement bias may be framed as “Supportive Checking-In.”
- Novice founders select it because it sounds virtuous
- Experienced founders reject it because it erodes autonomy
High Score = Low Risk: Consistently avoiding these socially desirable traps results in a higher safety score, signaling a preference for effectiveness over appearances.
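A deliberately simplified sketch of this scoring direction (the production risk engine weights derailers individually; the field names here are hypothetical):

```python
def bias_safety_score(choices: list[dict]) -> float:
    """Each scenario records whether the option chosen as 'most effective'
    was an engineered socially desirable trap (a disguised derailer such as
    micromanagement framed as 'supportive checking-in'). Consistently
    avoiding traps yields a high safety score: high score = low risk."""
    traps = [c for c in choices if c["trap_option_present"]]
    if not traps:
        return 1.0
    avoided = sum(1 for c in traps if not c["chose_trap"])
    return avoided / len(traps)

# Illustrative: founder faced 5 trap-bearing scenarios and fell for 1.
history = [{"trap_option_present": True, "chose_trap": False}] * 4 \
        + [{"trap_option_present": True, "chose_trap": True}]
print(bias_safety_score(history))  # 0.8
```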
Assessment Engine Pillar 3: EEA
Ecosystem Environmental Awareness – The Context Engine
Local Market Fit
Regulatory & Cultural Awareness
Contextual “Street Smarts”
The EEA operates as a “Contextual Intelligence Engine”. It measures whether a founder understands the specific environment in which they are operating—its rules, constraints, norms, and hidden dynamics—rather than relying on generic startup playbooks. A founder who succeeds in one ecosystem may fail in another if they attempt to apply the same assumptions unchanged. EEA is designed to measure this hyper-local readiness. Each assessment deck combines:
- 20 general ecosystem questions
- 20 industry-specific ecosystem questions
EEA Measurement Methodology
Ensuring cross-ecosystem fairness through equating, contextual benchmarking, and content freshness
Because the EEA generates different question sets for different regions and industries (e.g., US SaaS vs. UK BioTech), raw scores are not directly comparable. This comparability challenge, a consequence of matrix sampling (different founders answer different item sets), is resolved through equating.
Cross-Context Fairness
The Problem: Is a score of 800 in a relatively permissive regulatory environment (e.g., Delaware) equivalent to a score of 800 in a highly constrained one (e.g., Germany)? Raw scores would suggest yes, but the underlying competence required is fundamentally different!
The Solution: Anchor Item Equating. At least 20% of the questions in every EEA test deck are universal anchor items—identical across regions and industries. Performance on these anchors is used to statistically adjust the difficulty of the context-specific questions.
The Result: If anchor performance indicates strong general competence but regional scores appear lower, the system recognizes higher contextual difficulty and adjusts accordingly. This ensures that a score of 800 in a “hard” ecosystem reflects the same level of mastery as a score of 800 in an “easy” ecosystem—making EEA a universally valid measure of local readiness.
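A simplified mean-equating sketch of this adjustment; production equating is IRT-based, and all names and numbers here are illustrative:

```python
def equated_context_score(founder_context_score: float,
                          group_anchor_mean: float,
                          group_context_mean: float,
                          global_anchor_mean: float,
                          global_context_mean: float) -> float:
    """Simplified mean anchor equating. The gap between a group's anchor
    performance (general competence) and its context-item performance,
    relative to the same gap globally, estimates how much harder this
    context deck is; context scores are shifted by that estimate."""
    local_gap = group_anchor_mean - group_context_mean
    global_gap = global_anchor_mean - global_context_mean
    difficulty_adjustment = local_gap - global_gap
    return founder_context_score + difficulty_adjustment

# Anchors say this group is strong (780 vs. 700 globally), yet context items
# came out far lower than the usual anchor-context gap implies: harder deck.
print(equated_context_score(640, group_anchor_mean=780, group_context_mean=640,
                            global_anchor_mean=700, global_context_mean=660))
```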
Content Freshness Control
Ecosystems evolve quickly—especially regulations, compliance frameworks, and institutional norms.
To prevent scoring founders against obsolete information, Supsindex enforces a Time-to-Live (TTL) policy on all EEA questions.
- Each question has a defined review cycle
- Once expired, it is automatically quarantined
- Human domain experts must revalidate it before reuse
This ensures founders are never penalized or rewarded based on outdated standards.
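A minimal sketch of how such a TTL policy could be enforced; the 180-day cycle and the field names are illustrative assumptions:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class EEAQuestion:
    text: str
    last_reviewed: date
    ttl_days: int = 180            # illustrative review cycle
    quarantined: bool = False

    def check_freshness(self, today: date) -> None:
        """Quarantine the item once its review window expires; it stays out
        of circulation until a human domain expert clears the flag."""
        if today - self.last_reviewed > timedelta(days=self.ttl_days):
            self.quarantined = True

    def revalidate(self, today: date) -> None:
        """Called by a domain expert after confirming the content is current."""
        self.last_reviewed = today
        self.quarantined = False
```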
EEA Benchmarking System
Each founder is compared only within their Exact Contextual Bucket
- Same industry
- Same startup stage
- Same ecosystem
The normative dataset includes 1,000+ verified founder profiles, stratified for relevance. Benchmarks include:
Contextual Average: Mean score of direct peers
Top Decile (Top 10%): A realistic excellence benchmark, excluding statistical outliers
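A minimal sketch of bucket-level benchmarking using Python’s statistics module; the bucket keys and scores are illustrative:

```python
from statistics import mean, quantiles

def benchmark(scores_by_bucket: dict, bucket: tuple) -> dict:
    """Compare a founder only against peers in the same (industry, stage,
    ecosystem) bucket. Returns the contextual average and top-decile cutoff."""
    peers = scores_by_bucket[bucket]
    deciles = quantiles(peers, n=10)        # 9 cut points; last = 90th percentile
    return {"contextual_average": mean(peers),
            "top_decile_cutoff": deciles[-1]}

scores = {("FinTech", "pre-seed", "Germany"): [610, 645, 700, 720, 540, 780,
                                               660, 590, 810, 695, 730, 655]}
print(benchmark(scores, ("FinTech", "pre-seed", "Germany")))
```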
Each EEA assessment yields a score out of 1000. Why such a fine-grained scale? Because small margins in contextual understanding compound over time: differences such as 750 vs. 785 meaningfully correlate with downstream outcomes like enterprise sales readiness, regulatory delays, and survival rates.
Confidence Intervals (Measurement Honesty): Every score is reported with a Confidence Interval (CI) (e.g., 780 ± 15), representing the Standard Error of Measurement (SEM). This acknowledges uncertainty and provides investors and institutions with a scientifically honest range for true capability.
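A minimal sketch of how a score band can be derived from the SEM; the multiplier is an assumption here, since the width of the reported interval depends on the chosen confidence level:

```python
def score_band(score: float, sem: float, z: float = 1.0) -> tuple[float, float]:
    """Band of plausible true scores given the Standard Error of Measurement.
    z = 1.0 reports score +/- one SEM (a ~68% band); z = 1.96 widens to ~95%."""
    return (score - z * sem, score + z * sem)

low, high = score_band(780, 15)            # the "780 +/- 15" example above
print(f"780 (plausible true-score range: {low:.0f}-{high:.0f})")
```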
Real-World Sample Scenarios
Early-stage market validation
Example Challenge: A founder reviews traction data and must distinguish vanity metrics (press mentions, total signups) from actionable indicators (retention, DAU).
Measurement Logic: Signal Detection + IRT-weighted knowledge items evaluate whether the founder correctly filters irrelevant noise without dismissing meaningful signals.
Capability Revealed: Entrepreneurial literacy, cognitive discipline, and depth of understanding under information overload.
Competitive shock under pressure
Example Challenge: A competitor launches a similar product at a significantly lower price point. The founder must choose the most and least effective response.
Measurement Logic: Thurstonian modeling recovers latent utilities behind choices, detecting both adaptive strategies and cognitive derailers (e.g., panic, overconfidence).
Capability Revealed: Decision-making quality, resilience, bias resistance, and strategic judgment in stressful conditions.
Market entry & compliance
Example Challenge: A SaaS founder entering the U.S. market must identify which compliance framework (HIPAA, SOC 2, PCI DSS) is required to sell to enterprise clients.
Measurement Logic: Anchor-item equating adjusts difficulty across ecosystems, ensuring scores reflect true contextual mastery rather than regulatory leniency.
Capability Revealed: Ecosystem literacy, regulatory awareness, and readiness to operate within real institutional constraints.
Supsindex Quartile Ranking Framework
Each Supsindex assessment produces a final score that is benchmarked against founders in the same contextual bucket (industry, stage, and ecosystem). Based on this comparison, founders are placed into one of four quartiles, translating raw performance into actionable interpretation.
Top 25%
Investable Grade
Strong alignment with the behavioral, cognitive, and contextual profiles of founders who successfully raise capital and scale ventures. Suitable for immediate investor engagement.
50–75%
High Potential
Solid foundational capability with identifiable, isolated gaps (e.g., strong product insight paired with micromanagement risk). Best addressed through targeted coaching or mentorship.
25–50%
Developing
Noticeable gaps in market understanding, ecosystem awareness, or decision frameworks. Requires structured entrepreneurial education and capability development.
Bottom 25%
Foundational
Elevated risk indicators detected (e.g., low resilience, high cognitive bias exposure). Suggests the need for role reconsideration, skill rebuilding, or complementary co-founder alignment.
Quartile placement is not a label of worth—it is a diagnostic signal, designed to guide better decisions, not gatekeep opportunity.
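For concreteness, a minimal sketch of the percentile-to-band mapping described above; the handling of exact boundary values is an assumption of this sketch:

```python
def quartile_band(percentile: float) -> str:
    """Map a founder's within-bucket percentile to the Supsindex band."""
    if percentile >= 75:
        return "Investable Grade"
    if percentile >= 50:
        return "High Potential"
    if percentile >= 25:
        return "Developing"
    return "Foundational"

print(quartile_band(82))  # "Investable Grade"
```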
Fairness & Assessment Ethics
Talent is universally distributed, but opportunity is not
Our assessment engine is intentionally designed to reduce systemic bias and ensure ethical, statistically sound evaluation through the following mechanisms:
Adverse Impact Analysis
We continuously run Differential Item Functioning (DIF) analyses using Mantel–Haenszel statistics. If an item advantages or disadvantages a demographic group (holding ability constant), it is flagged, quarantined, and removed—ensuring fairness across gender, ethnicity, and background.
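A minimal sketch of the Mantel–Haenszel common odds ratio for a single item, stratified by ability band; the data are illustrative, and a ratio far from 1.0 flags potential DIF:

```python
def mantel_haenszel_or(strata: list[tuple]) -> float:
    """Mantel-Haenszel common odds ratio for one item. Each stratum (an
    ability band) is a 2x2 table:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    An odds ratio far from 1.0 means the item behaves differently for the
    two groups even when ability is held constant."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Illustrative: three ability bands, item slightly favoring the reference group.
strata = [(40, 10, 35, 15), (30, 20, 25, 25), (15, 35, 10, 40)]
print(mantel_haenszel_or(strata))  # ~1.6, i.e., > 1: favors the reference group
```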
Model Fit Transparency
Supsindex does not rely on blind trust in algorithms. We actively monitor Model Fit Indices (including RMSEA and M₂ statistics) to confirm that scoring models accurately represent the underlying data structure.
Marginal Reliability
For complex forced-choice instruments such as the GEB, traditional reliability metrics (e.g., Cronbach’s Alpha) are not valid, because forced-choice data violate the assumptions those metrics rest on. Supsindex applies Marginal Reliability coefficients to ensure precision without statistical distortion.
Cultural Neutrality
The Thurstonian structure of the GEB reduces cultural response styles (such as uniform high self-ratings common in some cultures). By forcing comparative judgments, the system measures capability rather than cultural expression, enabling global applicability.
Local Item Dependence Control
While classical IRT assumes question independence, real-world scenarios often involve clustered case-based items. Supsindex applies Testlet Response Theory (TRT) to correct for Local Item Dependence (LID), preventing artificial score inflation.
Data Privacy by Design
Founder data belongs to the founder. Supsindex follows Privacy-by-Design principles:
- All data is encrypted
- Benchmarking uses anonymized aggregates
- No behavioral data is sold or shared without explicit consent