Paddock Setup Flow v2
Revised architecture — editor-first, no re-parse, format detection fixed, real-time enrichment
1. Main Flow — Upload to Enrichment
flowchart TD
START([Student drops syllabus PDF/DOCX]) --> EXTRACT["Text extraction\n(server-side)"]
EXTRACT --> FORMAT["Format detection\n(FIXED: count date+topic PAIRS,\nnot raw date lines)"]
FORMAT --> PARSE["LLM parse\n(3-tier cascade:\nregex dates → structural → LLM)"]
PARSE --> STREAM["Stream results to client\n(SSE: modules appear live)"]
STREAM --> EDITOR["SIDE-BY-SIDE EDITOR\n━━━━━━━━━━━━━━━━━━━\nLeft: uploaded syllabus text\nRight: parsed week plan\n+ meeting days question\n━━━━━━━━━━━━━━━━━━━\nAlways shown. Always editable."]
EDITOR --> QUALITY{Parse quality?}
QUALITY -->|"High confidence\n(dated/numbered,\n8+ milestones)"| EDITOR_LIGHT["Editor pre-filled\nStudent glances, confirms\nMinimal editing needed"]
QUALITY -->|"Medium confidence\n(topic-based, 8+ topics\nOR numbered, 4-7)"| EDITOR_MODERATE["Editor pre-filled\nMeeting days REQUIRED\nMay need date adjustment\nFormat note shown if ambiguous"]
QUALITY -->|"Low confidence\n(< 4 milestones\nor garbled)"| EDITOR_HEAVY["Editor shows best attempt\nStudent corrects heavily\nOr starts fresh from syllabus text"]
EDITOR_LIGHT --> CREATE
EDITOR_MODERATE --> CREATE
EDITOR_HEAVY --> CREATE
CREATE[["Create workspace\n+ start enrichment"]]
CREATE --> WORKSPACE["→ /paddock/courseId\nCourse workspace\nWeek plan visible immediately\nScenarios stream in via SSE"]
style EDITOR fill:#fef3c7,stroke:#f59e0b,color:#78350f
style EDITOR_LIGHT fill:#d1fae5,stroke:#10b981,color:#064e3b
style EDITOR_MODERATE fill:#fef3c7,stroke:#f59e0b,color:#78350f
style EDITOR_HEAVY fill:#fee2e2,stroke:#ef4444,color:#7f1d1d
style CREATE fill:#d1fae5,stroke:#10b981,color:#064e3b
style WORKSPACE fill:#d1fae5,stroke:#10b981,color:#064e3b
style FORMAT fill:#dbeafe,stroke:#3b82f6,color:#1e3a5a
style PARSE fill:#dbeafe,stroke:#3b82f6,color:#1e3a5a
style STREAM fill:#ede9fe,stroke:#8b5cf6,color:#3b0764
Key change: Every parse quality level lands in the same editor. The only difference is how much of the pre-filled plan the student has to correct. There is no retry loop and no re-parse: the editor is the recovery mechanism.
2. The Editor — Side-by-Side Layout
flowchart LR
subgraph left ["LEFT PANEL — Syllabus (readonly)"]
direction TB
L1["Uploaded syllabus text"]
L2["Detected sections highlighted"]
L3["Scrollable, searchable"]
L4["Student references this\nwhile editing the plan"]
end
subgraph right ["RIGHT PANEL — Parsed Plan (editable)"]
direction TB
R0["Meeting days selector\n(pre-filled if detected, required if not)\n━━━━━━━━━━━━"]
R1["Week 1 · Jan 13\nTitle: [Negligence ]\nTopics: Duty, Standard of Care"]
R2["Week 2 · Jan 20\nTitle: [Strict Liability ]\nTopics: Abnormally dangerous"]
R3["Week 3 · Jan 27\nTitle: [Products Liability ]\nTopics: Manufacturing, Design"]
R4["..."]
R5["Week 14 · Apr 14\nTitle: [Review ]"]
R6["━━━━━━━━━━━━\n[+ Add week] [Create my course]"]
end
left ~~~ right
style left fill:#1a1a2e,stroke:#4a4a6a,color:#c0c0d0
style right fill:#1a2e1a,stroke:#2a5a2a,color:#a0d0a0
Editor capabilities: edit topic titles, adjust dates, add or remove weeks, and change meeting days (which triggers re-distribution of topics across dates). All edits are inline, with no modals. The syllabus text stays on the left for reference, so the student can verify that the plan matches their document.
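A change to the meeting days means the calendar dates behind the plan have to be recomputed. The sketch below shows one way that recomputation could work, given a semester start and a set of weekdays. The function name `meeting_dates`, the Monday-based week alignment, and the 2025 calendar year in the usage lines are illustrative assumptions, not part of the design above.

```python
from datetime import date, timedelta


def meeting_dates(semester_start: date, weekdays: list[int], weeks: int = 14) -> list[date]:
    """Return every class meeting date for the semester.

    weekdays uses date.weekday() numbering: Monday=0 ... Sunday=6.
    Hypothetical helper: the real re-distribution logic is not shown in this document.
    """
    # Align to the Monday of the week that contains the semester start.
    first_monday = semester_start - timedelta(days=semester_start.weekday())
    dates = []
    for week in range(weeks):
        for weekday in sorted(set(weekdays)):
            day = first_monday + timedelta(weeks=week, days=weekday)
            # Skip meeting days that fall before the semester actually starts.
            if day >= semester_start:
                dates.append(day)
    return dates


# Switching from Monday-only to Monday/Wednesday doubles the meeting slots.
once_a_week = meeting_dates(date(2025, 1, 13), [0])
twice_a_week = meeting_dates(date(2025, 1, 13), [0, 2])
print(len(once_a_week), len(twice_a_week))
```

Recomputing the full date list on every change keeps the editor stateless with respect to dates: topics are re-assigned to whatever slots the new list contains.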
3. Format Detection — Current vs Fixed
Current (broken)
flowchart LR
subgraph current ["CURRENT: Counts raw lines"]
direction TB
C1["Scan every line"]
C1 --> C2{"Line starts\nwith date?"}
C2 -->|Yes| C3["dateCount++"]
C2 -->|No| C4{"Line starts\nwith number?"}
C4 -->|Yes| C5["numberedCount++"]
C4 -->|No| C6{"Line starts\nwith Topic/Unit?"}
C6 -->|Yes| C7["topicCount++"]
C3 --> C8{"dateCount ≥ 5?"}
C8 -->|Yes| C9["formatHint = 'dated'"]
end
subgraph problem ["PROBLEM"]
direction TB
P1["'Jan 14: Office hours change'\n'Feb 3: Paper 1 due'\n'Mar 10: Midterm exam'\n'Apr 1: Paper 2 due'\n'May 5: Final exam'\n━━━━━━━━━━━━\n5 date lines → 'dated'\nBut these are ALL admin dates!\nThe syllabus is topic-based."]
end
current ~~~ problem
style current fill:#2e1a1a,stroke:#5a2a2a,color:#d0a0a0
style problem fill:#2e1a1a,stroke:#ef4444,color:#fca5a5
Fixed (proposed)
flowchart LR
subgraph fixed ["FIXED: Counts date+topic PAIRS"]
direction TB
F1["Scan every line"]
F1 --> F2{"Line starts\nwith date?"}
F2 -->|Yes| F3{"Is it admin?\n(exam, due date,\noffice hours)"}
F3 -->|Yes| F4["Skip — not a topic"]
F3 -->|No| F5{"Has meaningful\ntopic text?"}
F5 -->|Yes| F6["dateTopicPairs++"]
F5 -->|No| F4
F6 --> F7{"pairs ≥ 4?"}
F7 -->|Yes| F8["formatHint = 'dated'"]
F7 -->|No| F9["Continue to\nnumbered/topic check"]
end
style fixed fill:#1a2e1a,stroke:#2a5a2a,color:#a0d0a0
The same fix applies to numbered detection: filter out non-topic headers (Grading, Materials, Prerequisites, Attendance) before counting. A syllabus with "1. Grading, 2. Attendance, 3. Materials, 4. Texts, 5. Topics" should not trigger 'numbered', because only item 5 is topical.
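The pair-counting rule can be sketched roughly as follows. The regexes, the admin keyword list, the "meaningful topic text" test (at least one word of three or more letters), and the function names are all assumptions made for illustration; only the threshold of 4 pairs and the filtered header names come from the flow above ("Texts" is added because the example treats it as non-topical).

```python
import re

# Rough date matcher (assumed): "Jan 14", "February 3", or "2/3" at line start.
DATE_PREFIX = re.compile(
    r"^\s*(?:(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.?\s+\d{1,2}"
    r"|\d{1,2}/\d{1,2})\b\s*[:\-–]?\s*",
    re.IGNORECASE,
)
# Assumed admin keywords; the real list is not given in this document.
ADMIN_WORDS = re.compile(
    r"\b(?:exams?|midterm|final|due|quiz|holiday|office hours|no class)\b",
    re.IGNORECASE,
)
NUMBERED = re.compile(r"^\s*\d+[.)]\s+(.*)$")
NON_TOPIC_HEADERS = {"grading", "materials", "prerequisites", "attendance", "texts"}
MIN_PAIRS = 4  # from the "pairs ≥ 4?" decision node


def count_date_topic_pairs(lines):
    """Count lines that pair a date with non-admin topic text."""
    pairs = 0
    for line in lines:
        match = DATE_PREFIX.match(line)
        if not match:
            continue
        rest = line[match.end():].strip()
        if ADMIN_WORDS.search(rest):
            continue  # exam, due date, office hours: skip, not a topic
        if re.search(r"[A-Za-z]{3,}", rest):
            pairs += 1  # date followed by meaningful topic text
    return pairs


def count_numbered_topics(lines):
    """Count numbered lines whose heading is not an administrative section."""
    count = 0
    for line in lines:
        match = NUMBERED.match(line)
        if match and match.group(1).strip().lower() not in NON_TOPIC_HEADERS:
            count += 1
    return count


def detect_format_hint(lines):
    """Return 'dated' when enough genuine pairs exist, else None (fall through)."""
    return "dated" if count_date_topic_pairs(lines) >= MIN_PAIRS else None
```

Run against the five admin lines from the PROBLEM box, `count_date_topic_pairs` finds no pairs, so the syllabus falls through to the numbered/topic checks instead of being labelled 'dated'.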
4. Confidence Matrix
| Has date+topic pairs? | Has week labels? | Milestone count | True format | Confidence | Editor behavior |
| --- | --- | --- | --- | --- | --- |
| 4+ genuine pairs | Maybe | 8+ | Dated | High | Pre-filled with real dates. Minimal editing. |
| 4+ genuine pairs | Maybe | 4-7 | Dated, short course | Medium | Pre-filled. Note: "We found a shorter schedule." |
| 0-3 pairs | Yes, 4+ | Any | Numbered | High | Pre-filled with week numbers. Dates synthesized. |
| 0-3 pairs | No | 8+ | Topic-based | Medium | Topics pre-filled. Meeting days REQUIRED. Dates synthesized after. |
| 0-3 pairs | No | 4-7 | Ambiguous | Low | Show what we found. Note: "We found fewer topics than expected." Hybrid format question if needed. |
| 0-3 pairs | No | <4 | Bad parse | Very low | Editor shows best attempt. Student corrects heavily or uses manual entry. |
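Read top to bottom, the matrix behaves like an ordered rule list. The sketch below encodes it that way; the function name `classify_parse`, the lowercase labels, and the handling of the one case the matrix does not list (4+ pairs with fewer than 4 milestones, which falls through to the later rows) are assumptions.

```python
def classify_parse(pairs: int, week_labels: int, milestones: int) -> tuple[str, str]:
    """Map detection counts to (true_format, confidence), following the matrix rows in order."""
    if pairs >= 4 and milestones >= 8:
        return ("dated", "high")
    if pairs >= 4 and milestones >= 4:
        return ("dated, short course", "medium")
    # From here on the rows assume 0-3 genuine date+topic pairs.
    if week_labels >= 4:
        return ("numbered", "high")
    if milestones >= 8:
        return ("topic-based", "medium")
    if milestones >= 4:
        return ("ambiguous", "low")
    return ("bad parse", "very low")


print(classify_parse(pairs=6, week_labels=0, milestones=12))
print(classify_parse(pairs=1, week_labels=0, milestones=5))
```

Keeping the rules in a single ordered function makes it easy to check each row of the table against one branch.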
5. Topic Distribution Formula
flowchart TD
INPUT["Inputs:\nN = number of topics\nM = meetings per week (1, 2, or 3)\nW = semester weeks (default 14)"]
INPUT --> CALC["total_meetings = W × M\nmeetings_per_topic = total_meetings / N"]
CALC --> RATIO{meetings_per_topic?}
RATIO -->|"≈ 1\n(N ≈ total_meetings)"| ONE_TO_ONE["1:1 mapping\nEach topic = one meeting\nLeftover meetings = flex/review"]
RATIO -->|"> 1.5\n(fewer topics than meetings)"| MULTI_SESSION["Multi-session topics\nLLM assigns weight:\n'Negligence' → 4 meetings\n'Review' → 1 meeting\nHeavy topics get more time"]
RATIO -->|"< 0.8\n(more topics than meetings)"| GROUPED["Grouped topics\nLLM combines related:\n'Products + Strict' → 1 meeting\n'Duty + Breach' → 1 meeting\nRelated topics share sessions"]
ONE_TO_ONE --> RESULT
MULTI_SESSION --> RESULT
GROUPED --> RESULT
RESULT["Generate week plan with\ncalendar dates from:\nsemester start + meeting days"]
style INPUT fill:#1a1a2e,stroke:#6366f1,color:#c0c0ff
style ONE_TO_ONE fill:#d1fae5,stroke:#10b981,color:#064e3b
style MULTI_SESSION fill:#fef3c7,stroke:#f59e0b,color:#78350f
style GROUPED fill:#fef3c7,stroke:#f59e0b,color:#78350f
style RESULT fill:#d1fae5,stroke:#10b981,color:#064e3b
Examples
| Course | Topics (N) | Meetings/wk (M) | Total meetings | Ratio | Strategy |
| --- | --- | --- | --- | --- | --- |
| Torts (1x/week seminar) | 12 | 1 | 14 | 1.17 | 1:1 + 2 flex weeks |
| Torts (2x/week lecture) | 12 | 2 | 28 | 2.33 | Multi-session: heavy topics get 3-4 meetings |
| Con Law (1x/week) | 18 | 1 | 14 | 0.78 | Grouped: related topics share weeks |
| Civ Pro (3x/week) | 10 | 3 | 42 | 4.2 | Multi-session: each topic is a ~week-long unit |
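The ratio check behind these examples can be sketched as below. The flowchart only names three bands (≈ 1, > 1.5, < 0.8), so treating everything between 0.8 and 1.5 as 1:1 is an assumption, chosen because it matches the Torts seminar row (ratio 1.17). The function name and the returned dictionary shape are also illustrative.

```python
def choose_strategy(n_topics: int, meetings_per_week: int, weeks: int = 14) -> dict:
    """Pick a distribution strategy from N topics, M meetings per week, and W weeks."""
    if n_topics <= 0:
        raise ValueError("n_topics must be positive")
    total_meetings = weeks * meetings_per_week
    ratio = total_meetings / n_topics  # meetings_per_topic in the flowchart
    if ratio > 1.5:
        strategy = "multi-session"  # fewer topics than meetings
    elif ratio < 0.8:
        strategy = "grouped"  # more topics than meetings
    else:
        strategy = "one-to-one"  # assumed to cover the whole 0.8-1.5 band
    # Leftover meetings become flex/review slots only in the 1:1 case.
    flex = max(total_meetings - n_topics, 0) if strategy == "one-to-one" else 0
    return {
        "total_meetings": total_meetings,
        "ratio": round(ratio, 2),
        "strategy": strategy,
        "flex_meetings": flex,
    }


for course, n, m in [("Torts 1x", 12, 1), ("Torts 2x", 12, 2), ("Con Law", 18, 1), ("Civ Pro", 10, 3)]:
    print(course, choose_strategy(n, m))
```

The weighting of heavy topics and the grouping of related ones are left to the LLM step described in the flowchart; this function only selects which of the three modes applies.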
6. Real-Time Enrichment on Workspace
sequenceDiagram
participant S as Student Browser
participant WS as /paddock/courseId
participant API as Enrich API
participant DB as Database
participant LLM as LLM APIs
S->>WS: Navigate to course workspace
WS->>S: Render week plan (immediate)
WS->>API: Subscribe SSE /api/paddock/enrich-status/courseId
Note over API,LLM: Enrichment pipeline running...
API->>LLM: Concept extraction (Pass 2B)
LLM-->>API: Concepts for each module
API->>DB: Write concepts
API-->>S: SSE: step=concepts, status=complete
API->>LLM: Domain classification
LLM-->>API: Domain confirmed
API->>DB: Write domain
API-->>S: SSE: step=domain, status=complete
API->>LLM: Scenario matching
LLM-->>API: Matched scenarios
API->>DB: Write scenarios
API-->>S: SSE: step=scenarios, status=complete
Note over S: Scenarios appear in workspace live
API->>LLM: Scenario generation (if needed)
LLM-->>API: Generated scenarios
API->>DB: Write generated
API-->>S: SSE: step=generation, status=complete
API->>DB: Run precompute
API-->>S: SSE: step=precompute, status=complete
Note over S: "Your course is ready" banner
S->>S: Full workspace loaded, no refresh needed
Student experience: After creating the course, the student lands on the workspace with their week plan already visible. Enrichment stages then stream in: concepts appear, scenarios populate, and finally the "ready" state activates. The page feels alive, like watching a Vercel deployment, and never needs a refresh.
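As a rough illustration of the event stream, the sketch below formats and parses Server-Sent Events for the five steps in the sequence diagram and derives the workspace state from them. The JSON payload shape, the function names, and the rule that the "ready" banner keys off the `precompute` step are assumptions; the diagram only specifies `step=…, status=complete`.

```python
import json

# Step names taken from the sequence diagram, in emission order.
STEPS = ["concepts", "domain", "scenarios", "generation", "precompute"]


def sse_message(step: str, status: str) -> str:
    """Serialize one event in SSE wire format: a 'data:' line plus a blank line."""
    payload = json.dumps({"step": step, "status": status})
    return f"data: {payload}\n\n"


def parse_sse(stream_text: str) -> list[dict]:
    """Parse a chunk of SSE text back into event dictionaries."""
    events = []
    for block in stream_text.split("\n\n"):
        data_lines = []
        for line in block.split("\n"):
            if line.startswith("data:"):
                value = line[len("data:"):]
                # SSE strips a single leading space after the colon.
                data_lines.append(value[1:] if value.startswith(" ") else value)
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


def workspace_state(events: list[dict]) -> dict:
    """Derive what the workspace should show from the events received so far."""
    done = {e["step"] for e in events if e["status"] == "complete"}
    return {
        "completed": [s for s in STEPS if s in done],
        "scenarios_visible": "scenarios" in done,
        "ready": "precompute" in done,  # assumed trigger for the "ready" banner
    }


partial = "".join(sse_message(s, "complete") for s in STEPS[:3])
print(workspace_state(parse_sse(partial)))
```

Deriving the UI state from the full list of received events, rather than mutating it per event, means a client that reconnects and replays events ends up in the same state.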
7. Partially Temporal Syllabi
flowchart TD
subgraph syllabus ["Partially Temporal Syllabus"]
direction TB
S1["Week 1 (Jan 13): Negligence"]
S2["Week 2 (Jan 20): Strict Liability"]
S3["Week 3 (Jan 27): Products Liability"]
S4["Week 4 (Feb 3): Vicarious Liability"]
S5["Week 5 (Feb 10): Damages"]
S6["--- MIDTERM ---"]
S7["Topics after midterm:\n· Defenses\n· Causation\n· Emotional Distress\n· Economic Torts\n· Review"]
end
subgraph parsed ["Parsed Result"]
direction TB
P1["Weeks 1-5: FROM SYLLABUS\n(real dates, high confidence)"]
P2["━━━ boundary ━━━"]
P3["Weeks 6-14: SYNTHESIZED\n(topics distributed across\nremaining 9 weeks using\nmeeting frequency)"]
end
subgraph editor ["Editor Shows"]
direction TB
E1["Wk 1 · Jan 13 · Negligence ●"]
E2["Wk 2 · Jan 20 · Strict Liability ●"]
E3["Wk 3 · Jan 27 · Products Liability ●"]
E4["Wk 4 · Feb 3 · Vicarious Liability ●"]
E5["Wk 5 · Feb 10 · Damages ●"]
E6["━━ Synthesized from your topics ━━"]
E7["Wk 6 · Feb 17 · Defenses ○"]
E8["Wk 7 · Feb 24 · Causation ○"]
E9["Wk 8 · Mar 3 · Emotional Distress ○"]
E10["Wk 9 · Mar 10 · Economic Torts ○"]
E11["Wk 14 · Apr 14 · Review ○"]
end
syllabus --> parsed --> editor
style P1 fill:#d1fae5,stroke:#10b981,color:#064e3b
style P3 fill:#fef3c7,stroke:#f59e0b,color:#78350f
style E6 fill:#fef3c7,stroke:#f59e0b,color:#78350f
● = from syllabus, ○ = synthesized. The editor marks the boundary clearly, so the student can trust the solid (●) weeks and adjust the synthesized (○) ones. The plan stays honest about its confidence level.
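A minimal sketch of how the editor rows could be assembled from the two sources follows. It assumes the week assignment for undated topics has already been decided by the distribution step, and that synthesized dates continue weekly from the last dated week. The function name `render_plan`, the tuple shapes, and the 2025 calendar year are hypothetical.

```python
from datetime import date, timedelta

BOUNDARY = "━━ Synthesized from your topics ━━"


def fmt(day: date) -> str:
    """Format a date like 'Jan 13'."""
    return f"{day.strftime('%b')} {day.day}"


def render_plan(dated, synthesized):
    """Build editor rows.

    dated:       list of (week_no, date, title) parsed from the syllabus
    synthesized: list of (week_no, title) assigned by the distribution step
    """
    dated = sorted(dated)
    last_week, last_date, _ = dated[-1]
    rows = [f"Wk {week} · {fmt(day)} · {title} ●" for week, day, title in dated]
    if synthesized:
        rows.append(BOUNDARY)
    for week, title in sorted(synthesized):
        # Synthesized dates continue weekly from the last dated week.
        day = last_date + timedelta(weeks=week - last_week)
        rows.append(f"Wk {week} · {fmt(day)} · {title} ○")
    return rows


rows = render_plan(
    dated=[(1, date(2025, 1, 13), "Negligence"), (5, date(2025, 2, 10), "Damages")],
    synthesized=[(14, "Review")],
)
print("\n".join(rows))
```

Carrying the marker with each row, instead of inferring it from the date, lets the editor keep the ● / ○ distinction even after the student edits a synthesized date.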