Yes. The right answer is not “store nothing.” The product needs to store enough feedback for the student to learn. The question is **what kind of feedback**, **where**, **for how long**, and **whether it preserves the uploaded question or a close derivative of it**.

The spec already points in the right direction. For user-uploaded prompts, it says SHEP should store “limited feedback,” while the uploaded prompt and prompt-derived packet are deleted or stored locally only. It also says feedback summaries may be stored but should avoid close prompt paraphrases, and that prompt embeddings, prompt text in logs, and prompt-derived packets should not be stored by default.

I would add a dedicated retention rule:

**SHEP may store student answers and feedback for a limited bar-prep season, but must not store the uploaded prompt, the generated question packet, or close paraphrases of the prompt unless the user affirmatively saves them in a private/local mode.**

The clean product rule is:

```text
Store:
- student answer
- score band
- issue tags
- skill metrics
- generic feedback
- revision guidance
- timestamps and progress data

Do not store by default:
- uploaded question text
- prompt-derived fact map
- extracted calls
- answer key
- expected conclusion map
- generated rubric packet
- prompt embeddings
- model traces containing the prompt
```
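One way to make that rule hard to violate by accident is an allowlist at the persistence boundary: only named fields survive, so a new prompt-derived field is dropped unless someone deliberately adds it. This is a minimal sketch; the field names and the `to_durable_record` helper are my assumptions, not part of the spec.

```python
# Hypothetical field names that mirror the "Store" list above.
STORABLE_FIELDS = frozenset({
    "student_answer",
    "score_band",
    "issue_tags",
    "skill_metrics",
    "feedback",
    "revision_guidance",
    "created_at",
})


def to_durable_record(grading_result: dict) -> dict:
    """Keep only allowlisted fields; anything not listed is dropped."""
    return {k: v for k, v in grading_result.items() if k in STORABLE_FIELDS}
```

An allowlist fails closed, which is the safer direction here: a denylist would silently persist any prompt-derived field nobody thought to name.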

The feedback can be useful without preserving the copyrighted expression. For example, this is safer stored feedback:

```json
{
  "submission_id": "sub_123",
  "user_id": "user_456",
  "mode": "user_uploaded_prompt",
  "subject": "Contracts",
  "score_band": 4,
  "issue_tags": ["formation", "consideration", "breach"],
  "feedback": {
    "issue_spotting": "You identified the main contract issue but missed one secondary issue.",
    "rule_accuracy": "Your rule statement should more clearly define consideration.",
    "application": "Your analysis needed a tighter connection between the facts and the legal standard.",
    "organization": "Use separate IRAC sections for each disputed issue."
  },
  "stored_prompt": false,
  "stored_prompt_packet": false,
  "expires_at": "2026-08-31T23:59:59-04:00"
}
```

This is riskier:

```json
{
  "key_facts": [
    "Painter agreed to paint Neighbor's house",
    "Neighbor promised to assign a claim",
    "Painter refused after learning the claim was disputed"
  ],
  "expected_conclusion": "Neighbor likely succeeds",
  "missing_facts": [
    "Student failed to discuss that the assigned claim was disputed"
  ]
}
```

The second one is more useful pedagogically, but it starts to look like a persistent condensed answer packet. If it is generated from an uploaded NCBE or California prompt, I would not store it on the backend by default.

## Recommended retention model

Use three buckets.

**Bucket 1: Durable learning record**

Store this in the user’s account through the bar season.

Contents:

- Student answer
- Score band
- Subject
- Issue tags
- Skill metrics
- High-level feedback
- Revision checklist
- Timing data
- Confidence score
- Prompt-storage flags

Retention:

- Default expiration: **30 days after the relevant bar exam date**
- User option: “Delete now”
- User option: “Export my data”
- Automatic deletion after expiration unless user affirmatively extends

For July 2026, a practical default would be:

```text
Default expires_at: August 31, 2026
```

That gives students a post-exam review/export window but avoids indefinite storage.
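A small helper can derive that date instead of hard-coding it. The sketch below assumes one possible reading of the rule: add 30 days to the last exam day, then round up to the end of that month. The rounding step and the exam date in the usage note are my assumptions.

```python
import calendar
from datetime import date, timedelta

REVIEW_WINDOW_DAYS = 30  # the post-exam window from the retention rule


def feedback_expiry(exam_end: date) -> date:
    """Last exam day + 30 days, rounded up to the last day of that month."""
    cutoff = exam_end + timedelta(days=REVIEW_WINDOW_DAYS)
    last_day = calendar.monthrange(cutoff.year, cutoff.month)[1]
    return cutoff.replace(day=last_day)
```

With an assumed last exam day of `date(2026, 7, 29)`, the 30-day cutoff is August 28, and rounding up gives `date(2026, 8, 31)`, which matches the default above.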

**Bucket 2: Prompt-derived session packet**

Do not store on backend by default.

Contents:

- Detected calls
- Fact hooks
- Expected issue structure
- Prompt-specific application map
- Expected conclusions
- Generated grading framework

Retention:

- In memory / cache only while grading
- Short TTL, e.g. 90 minutes to 24 hours
- Not searchable
- Not reusable
- Not used for training or evals
- Not embedded
- Deleted after grading or session expiration

If you need crash recovery, use a very short TTL:

```text
Session packet TTL: 2 hours
Maximum emergency recovery TTL: 24 hours
```
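Here is a rough sketch of what an expiring, in-memory packet store could look like. `SessionPacketCache` is a hypothetical name, the TTL is whatever the caller passes in, and a production system would more likely rely on the native expiry of its cache service. The only value taken from the text is the 24-hour ceiling.

```python
import time
from typing import Any, Callable, Optional

HARD_MAX_TTL_SECONDS = 24 * 60 * 60  # the 24-hour ceiling stated above


class SessionPacketCache:
    """In-memory packet store keyed by session; nothing is written to disk."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        if not 0 < ttl_seconds <= HARD_MAX_TTL_SECONDS:
            raise ValueError("TTL must be positive and at most 24 hours")
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items: dict[str, tuple[float, Any]] = {}

    def put(self, session_id: str, packet: Any) -> None:
        self._items[session_id] = (self._clock() + self._ttl, packet)

    def get(self, session_id: str) -> Optional[Any]:
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, packet = entry
        if self._clock() >= expires_at:
            # Expired entries are removed on access, not returned.
            del self._items[session_id]
            return None
        return packet

    def delete(self, session_id: str) -> None:
        """Explicit delete, e.g. as soon as grading completes."""
        self._items.pop(session_id, None)
```

Rejecting any TTL above the ceiling in the constructor means a misconfigured deployment fails at startup rather than quietly keeping packets longer than the policy allows.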

**Bucket 3: Optional user-private save**

Only if the user affirmatively saves.

Two acceptable versions:

Local browser save:

```text
Stored in IndexedDB
Not stored on SHEP backend
User can delete locally
Best legal posture
```

Private backend save:

```text
Stored only in user workspace
Encrypted if feasible
Not reusable across users
Not searchable by SHEP content tools
Not used for training/evals
Expires by default
User can delete
Subject to DMCA/takedown workflow
```

If you use private backend save, treat it as “storage at the direction of a user.” Section 512(c) of the Copyright Act (17 U.S.C. § 512(c)), added by the DMCA, provides a safe harbor for user-directed storage, but it is conditional: among other things, it requires a notice-and-takedown process and a repeat-infringer policy. ([Supreme Court][1])

## Recommended expiration periods

For the student’s own answer and generic feedback:

```text
Default: through the end of the bar season + 30 days
For July 2026: expires August 31, 2026
For February 2027: expires March 31, 2027
```

For uploaded prompt text:

```text
Default: never persisted
Transient processing only
Hard max if queued: 24 hours
```

For generated prompt packet:

```text
Default: not persisted
Runtime cache: 90 minutes
Hard max: 24 hours
```

For local-browser saved packet:

```text
User controlled
Show “Delete local packet” button
Optionally warn that clearing browser data removes it
```

For private backend saved packet, if you allow it:

```text
Default: 30 days
Maximum: until 30 days after the relevant exam
User can delete anytime
No indefinite save unless user explicitly opts in after warning
```

## The key distinction: “feedback” versus “packet”

I would define these terms in the database and product copy.

**Feedback** means the student-facing assessment of their answer. It should be phrased in general skill terms:

- “Your rule statement was incomplete.”
- “Your analysis needed more fact application.”
- “You missed a secondary remedies issue.”
- “Separate your discussion into clearer IRAC sections.”

**Packet** means the prompt-derived grading structure:

- Fact hooks
- Calls
- Expected answer map
- Expected conclusions
- Issue weights
- Rubric generated from the uploaded prompt

Feedback can be stored. Packet should usually not be stored.
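To keep stored feedback from drifting toward the prompt's wording, one option is a crude overlap check that runs before persistence, while the prompt is still in memory. The sketch below measures shared word trigrams. It catches copied phrasing only, not true paraphrase, and the function names and any blocking threshold are assumptions you would need to tune.

```python
import re


def _ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Lowercased word n-grams of the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def prompt_overlap(feedback: str, prompt: str, n: int = 3) -> float:
    """Fraction of the feedback's word n-grams that also occur in the prompt."""
    feedback_grams = _ngrams(feedback, n)
    if not feedback_grams:
        return 0.0
    shared = feedback_grams & _ngrams(prompt, n)
    return len(shared) / len(feedback_grams)
```

Generic feedback such as “Your rule statement was incomplete.” should score at or near zero against any prompt, while a sentence lifted from the fact pattern scores close to one.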

## Technical implementation rule

Add these columns to the grading/session model:

```json
{
  "source_mode": "shep_original | user_uploaded_prompt",
  "prompt_storage_policy": "none | transient | local_only | user_private_backend",
  "packet_storage_policy": "none | transient | local_only | user_private_backend",
  "stored_prompt": false,
  "stored_prompt_packet": false,
  "feedback_retention_until": "2026-08-31T23:59:59-04:00",
  "packet_expires_at": null,
  "user_can_delete": true,
  "used_for_training": false,
  "used_for_cross_user_reuse": false
}
```

If `source_mode = user_uploaded_prompt`, then enforce:

```text
stored_prompt = false by default
stored_prompt_packet = false by default
used_for_training = false always
used_for_cross_user_reuse = false always
```
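These invariants can be checked in code before a session row is written. The sketch below treats the two `used_for_*` flags as hard rules, and assumes the `stored_*` flags describe backend storage, so they may be true only under the `user_private_backend` policy. That interpretation and the `policy_violations` name are mine.

```python
def policy_violations(session: dict) -> list[str]:
    """Return the rules a session row breaks; an empty list means compliant."""
    if session.get("source_mode") != "user_uploaded_prompt":
        return []
    problems = []
    # Hard rules: never allowed for uploaded prompts.
    for flag in ("used_for_training", "used_for_cross_user_reuse"):
        if session.get(flag):
            problems.append(f"{flag} must be false for uploaded prompts")
    # Default rules: backend storage needs an explicit private-save policy.
    checks = (
        ("stored_prompt", "prompt_storage_policy"),
        ("stored_prompt_packet", "packet_storage_policy"),
    )
    for flag, policy_field in checks:
        if session.get(flag) and session.get(policy_field) != "user_private_backend":
            problems.append(f"{flag} requires {policy_field} = user_private_backend")
    return problems
```

Returning a list of violations rather than raising on the first one makes the same function usable both as a write-time guard and as a periodic audit over existing rows.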

Also add a deletion job:

```text
Every night:
- delete expired prompt packets
- delete expired private saved packets
- delete expired prompt-processing queues
- scrub orphaned trace/log references
- mark deletion audit record
```
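The core of that job is a filter on expiry plus an audit entry. Here is a minimal in-memory sketch, assuming each record carries hypothetical `id` and `expires_at` fields; a real job would issue deletes against the database, queues, and trace store instead.

```python
from datetime import datetime, timezone


def purge_expired(records: list[dict], now: datetime) -> tuple[list[dict], dict]:
    """Split records into kept vs. expired and build a content-free audit entry."""
    kept = [r for r in records if r["expires_at"] > now]
    deleted_ids = [r["id"] for r in records if r["expires_at"] <= now]
    audit = {
        "ran_at": now.isoformat(),
        "deleted_count": len(deleted_ids),
        "deleted_ids": deleted_ids,  # identifiers only, never packet content
    }
    return kept, audit
```

Keeping the audit entry to identifiers and counts matters: an audit log that copied packet contents would recreate the storage the job exists to remove.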

## Privacy copy

Use this wording:

**For uploaded prompts, SHEP does not need to keep the prompt or generated grading packet to preserve your learning history. We store your answer, score, issue tags, and feedback so you can review your progress. Prompt-derived grading packets are temporary by default and are not added to SHEP’s shared rubric library.**

For expiration:

**Your uploaded-prompt feedback is kept through the bar-prep season and expires after the exam review period unless you delete it earlier or choose to export it.**

For local save:

**You may save a generated study packet locally in your browser. Local packets are stored on your device/browser, not in SHEP’s central rubric library.**

Do not say “we do not store prompts” unless your LLM provider, logs, queues, and tracing tools also honor that. OpenAI’s API data controls say abuse-monitoring logs may contain prompts and responses and are retained up to 30 days by default unless a customer is approved for modified abuse monitoring or zero data retention. ([OpenAI Developers][2]) Anthropic describes Zero Data Retention as a mode where customer data is not stored at rest after the API response returns, except where needed for law or misuse prevention. ([Claude][3])

## My product recommendation

Do this:

```text
1. Store student answers and generic feedback through the relevant bar season.
2. Expire uploaded-prompt feedback 30 days after the exam.
3. Do not store uploaded prompts by default.
4. Do not store generated packets by default.
5. Keep runtime packets in a short-lived cache only.
6. Allow optional local-browser packet save.
7. Avoid backend packet save for MVP.
8. Add user delete/export controls.
9. Add a retention disclosure in the upload flow.
```

The practical rule is:

**Keep the student’s learning record. Do not keep the uploaded question’s answer key.**

That gives users continuity without turning SHEP into a stored library of prompt-derived rubrics.

[1]: https://www.supremecourt.gov/DocketPDF/24/24-171/373201/20250829115749872_250829a%20Stat%20App%20for%20efiling.pdf "Statutory Addendum 17 U.S.C. § 106 .. ..."
[2]: https://developers.openai.com/api/docs/guides/your-data "Data controls in the OpenAI platform"
[3]: https://platform.claude.com/docs/en/build-with-claude/api-and-data-retention "API and data retention - Claude Console"
