fixups in formatting; minor other stuff.

Nicole Dresselhaus
2025-06-29 13:46:55 +02:00
parent 2ba3c00de4
commit eb2d14ba98
13 changed files with 1186 additions and 1036 deletions


@@ -50,12 +50,12 @@ format:
## Introduction
-In the fast eveolving field of AI there is a clear lack of reports on "what
+In the fast-evolving field of AI there is a clear lack of reports on "what
really works". Some techniques hailed as revolutionary (like the DeepSeek
Aha-Moment[@DeepSeek-AI2025DeepSeekR1IncentivizingReasoning]) for unlimited
-potential were soon realized to "just" optimize nieche problems that can
+potential were soon realized to "just" optimize niche problems that can
be benchmarked[@Shao2025SpuriousRewardsRethinking]^[Like all decent humans I ain't
-got time to read up on everything - so a big shoutout to
+got time to read up on everything - so a big shout-out to
[@bycloud2025LLMsRLRevelation] for doing thorough reviews on ML-topics and
linking the respective papers!].
@@ -74,7 +74,7 @@ papers tested their finding on (qwen-series) also gets better with
RLVR-optimization if rewards are random instead of verified].
Therefore see this "Field Guide" for what it is: A current state of things that
-work for at least 1 individuum in exactly this ecosystem at this point in time.
+work for at least 1 individual in exactly this ecosystem at this point in time.
## How to program with Cursor
@@ -83,7 +83,7 @@ In essence [Cursor](https://cursor.com) is "just" a fork of
functionality: Automatically injecting files into LLM-Prompts, offering
tool-aware LLMs to use [MCP](https://modelcontextprotocol.io/introduction)s,
read the filesystem, execute arbitrary commands in the shell (either
-automomatically or after permission), getting feedback from the editor (i.e.
+automatically or after permission), getting feedback from the editor (i.e.
installed linters, language-servers etc.) and thus have the same (or even
better) information/tools available as the programmer in front of the screen.
@@ -97,10 +97,10 @@ they want - it can't be proven and (especially under US-Law) is not even
possible to resist lawful orders (including the gag-order to not talk about
these).
-In practise one feels the direct pain points more severly. Some regular examples
-include generating redundant code, because the current context was not aware of
-utility-modules and functions it could use - leading to huge technical debt in
-no time.
+In practice one feels the direct pain points more severely. Some regular
+examples include generating redundant code, because the current context was not
+aware of utility-modules and functions it could use - leading to huge technical
+debt in no time.
Therefore my preferred workflow is to "think bigger". Imagine being a product
owner of a huge, sluggish company. The left hand never knows what the right hand
@@ -137,9 +137,9 @@ The main theme always follows a similar pattern:
what is _out of scope_.
- Pin the desired behaviour in a **Specification**.
Either this means changing currently established specifications (i.e.
-bug/chang) or writing complete new ones (i.e. feature).
+bug/change) or writing completely new ones (i.e. feature).
- Investigate **Spec-Compliance**.
-Again the agentlooks at the codebase to identify _where_ things should change
+Again the agent looks at the codebase to identify _where_ things should change
and _how_. Also recommendations are made on how it could achieve the goal.
- Generate **Tasks**.
From the compliance-report of spec-deviations (either from a bug or from a
@@ -231,11 +231,27 @@ one-sentence feature description plus any additional Q&A with the stakeholder.
### Output
-Create /tasks/<feature>/PRD.md • Markdown only no prose, no codefences. •
-File structure: # <Feature title> ## 1. Problem / Motivation ## 2. Goals ## 3.
-NonGoals ## 4. Target Users & Personas ## 5. User Stories (Gherkin
-“Given/When/Then”) ## 6. Acceptance Criteria ## 7. Technical Notes /
-Dependencies ## 8. Open Questions
+- Create /tasks/<feature>/PRD.md
+- Markdown only; no prose, no code fences.
+- File structure:
+> # <Feature title>
+>
+> ## 1. Problem / Motivation
+>
+> ## 2. Goals
+>
+> ## 3. Non-Goals
+>
+> ## 4. Target Users & Personas
+>
+> ## 5. User Stories (Gherkin “Given/When/Then”)
+>
+> ## 6. Acceptance Criteria
+>
+> ## 7. Technical Notes / Dependencies
+>
+> ## 8. Open Questions
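Since the skeleton is fixed, compliance with it can be checked mechanically. A minimal Python sketch of such a check (the function name and the hard-coded section list are my assumptions derived from the skeleton, not part of the rule):

```python
import re

# Required H2 sections, copied from the PRD skeleton above
# (assumption: titles must match exactly, including numbering).
REQUIRED_SECTIONS = [
    "1. Problem / Motivation",
    "2. Goals",
    "3. Non-Goals",
    "4. Target Users & Personas",
    "5. User Stories (Gherkin “Given/When/Then”)",
    "6. Acceptance Criteria",
    "7. Technical Notes / Dependencies",
    "8. Open Questions",
]

def missing_sections(prd_text: str) -> list[str]:
    """Return the required section titles absent from a PRD document."""
    headings = re.findall(r"^## +(.+?) *$", prd_text, flags=re.MULTILINE)
    return [s for s in REQUIRED_SECTIONS if s not in headings]
```

Run after the "go" step, this catches a skipped section before any human review does.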
### Process
@@ -247,23 +263,26 @@ Dependencies ## 8. Open Questions
ask for further clarification up to 3 times following this schema, else flag
it in _Open Questions_.
4. After questions are answered, reply exactly: Ready to generate the PRD.
-5. On a user message that contains only the word "go" (caseinsensitive):
-Generate /tasks/<feature>/PRD.md following _Output_ spec. • Reply:
-<feature>/PRD.md created review it.
+5. On a user message that contains only the word "go" (case-insensitive):
+   - Generate /tasks/<feature>/PRD.md following the _Output_ spec.
+   - Reply: <feature>/PRD.md created; review it.
6. STOP. Do **not** generate tasks or code in this rule.
### Writing guidelines
-Keep each bullet ≤120 characters. • Use action verbs and measurable language.
-• Leave TBDs only in _Open Questions_. • No business fluff pretend the reader
-is a junior developer.
+- Keep each bullet ≤120 characters.
+- Use action verbs and measurable language.
+- Leave TBDs only in _Open Questions_.
+- No business fluff; pretend the reader is a junior developer.
### Safety rails
-Assume all work happens in a nonproduction environment, unless otherwise
-stated or requested by you. • Do not include sensitive data or credentials in
-the PRD. • Check the generated Document with `markdownlint` (if available),
-apply auto-fixes and fix the remaining issues manually.
+- Assume all work happens in a non-production environment, unless otherwise
+  stated or requested by you.
+- Do not include sensitive data or credentials in the PRD.
+- Check the generated document with `markdownlint` (if available), apply
+  auto-fixes and fix the remaining issues manually.
```
:::
@@ -305,19 +324,16 @@ Every specification should include:
```
2. **Scope and Boundaries**
   - What is included/excluded
   - Dependencies on other specifications
   - Relationship to other components
3. **Detailed Requirements**
   - Structured by logical sections
   - Clear, unambiguous language
   - Examples where helpful
4. **Error Handling**
   - How errors should be handled
   - Fallback behaviors
   - Edge cases
@@ -614,12 +630,13 @@ list that a junior developer (human or AI) can follow without extra context.
### Output
-Create /tasks/<feature>/TASKS.md (overwrite if it exists). • Markdown only, no
-prose around it. • Epics = H2 headings (`## 1. <Epic>`). • Tasks = unchecked
-checkboxes (`- [ ] 1.1 <task>`). • Subtasks = indent one space under their
-parent (` - [ ] 1.1.1 <subtask>`). • Create a
-/tasks/<feature>/Task*<Epic>*<task>\_<subtask>.md (i.e. `Task_3_2_4.md` for Epic
-3, Task 2, Subtask 4)
+- Create /tasks/<feature>/TASKS.md (overwrite if it exists).
+- Markdown only, no prose around it.
+- Epics = H2 headings (`## 1. <Epic>`).
+- Tasks = unchecked checkboxes (`- [ ] 1.1 <task>`).
+- Subtasks = indent one space under their parent (` - [ ] 1.1.1 <subtask>`).
+- Create a `/tasks/<feature>/Task_<Epic>_<task>_<subtask>.md` (e.g.
+  `Task_3_2_4.md` for Epic 3, Task 2, Subtask 4).
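The checkbox format above is regular enough to be read back by a script, e.g. for the CI integration mentioned later. A small hypothetical sketch (the regex and function name are my assumptions, not part of the rule):

```python
import re

# Matches lines like "- [ ] 1.1 <task>" or " - [x] 1.1.1 <subtask>":
# optional indent, checkbox, dotted numbering, title.
TASK_RE = re.compile(r"^ *- \[([ x])\] (\d+(?:\.\d+)+) (.+)$")

def parse_tasks(tasks_md: str):
    """Yield (numbering, title, done) for every checkbox line in TASKS.md."""
    for line in tasks_md.splitlines():
        m = TASK_RE.match(line)
        if m:
            yield m.group(2), m.group(3), m.group(1) == "x"
```

Note that the one-space indent of subtasks is purely cosmetic here: the dotted numbering (`1.1.1`) already encodes the hierarchy.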
### Process
@@ -644,20 +661,22 @@ parent (` - [ ] 1.1.1 <subtask>`). • Create a
### Writing guidelines
-Each item ≤120 characters, start with an action verb. • Hints are allowed
-below each item as HTML-Comment and do not count against the 120 characters. •
-Group related work into logical epics with ≤7 direct child items. • Prefer
-concrete file paths, commands, specs or APIs when available. • Skip
-implementation details obvious from the codebase in the overview. • If a task
-only concerns up to 5 files, name them in the detailed file. Otherwise give
-hints on how to search for them (i.e. "everything under src/models/").
+- Each item ≤120 characters, start with an action verb.
+- Hints are allowed below each item as an HTML comment and do not count against
+  the 120 characters.
+- Group related work into logical epics with ≤7 direct child items.
+- Prefer concrete file paths, commands, specs or APIs when available.
+- Skip implementation details obvious from the codebase in the overview.
+- If a task only concerns up to 5 files, name them in the detailed file.
+  Otherwise give hints on how to search for them (e.g. "everything under
+  `src/models/`").
### Safety rails
-Never touch production data. • Assume all work happens in a feature branch,
-never commit directly to main. • Check the generated Document with
-`markdownlint` (if available), apply auto-fixes and fix the remaining issues
-manually.
+- Never touch production data.
+- Assume all work happens in a feature branch; never commit directly to main.
+- Check the generated document with `markdownlint` (if available), apply
+  auto-fixes and fix the remaining issues manually.
```
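To illustrate the file-naming scheme from the Output section above, a tiny hypothetical helper (not part of the rule) that maps a subtask's numbering to its detail-file name:

```python
def task_file(epic: int, task: int, subtask: int) -> str:
    # Hypothetical helper: derives the per-subtask detail file,
    # e.g. Epic 3, Task 2, Subtask 4 -> "Task_3_2_4.md".
    return f"Task_{epic}_{task}_{subtask}.md"
```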
:::
@@ -675,9 +694,9 @@ specialised coding-LLMs are enough to get the job done with this preparation.
## Example: Rules in Action
The codebase we look at here is a project called `gitlab_overviewer`. It takes
-gitlab-api-keys and generates nice overviews for tracking metadata in different
+GitLab API keys and generates nice overviews for tracking metadata in different
projects across different groups. With a nice export to markdown (for rendering
-in gitlab itself) and quarto (for exporting to i.e. confluence) with multiple
+in GitLab itself) and Quarto (for exporting to e.g. Confluence) with multiple
pages etc.
The current issue is that, due to a complete rewrite, we are happy with the
@@ -1225,7 +1244,7 @@ this method works and _can_ yield great outcomes. Even small discrepancies in
the codebase tend to pop up during spec-reviews (which can be automated!). Next
up would be running those in some kind of CI-job and integrating tools like
issue-tracking into the agent instead of simple markdown-files in the repository
-as makeshift issue-tracker. But not by me for the forseeable future, so if you
+as makeshift issue-tracker. But not by me for the foreseeable future, so if you
are looking for a project, feel free!
**All in all this isn't a silver bullet for all AI-assisted development