|
1 | 1 | # Roadmap |
2 | 2 | |
3 | | -<img width="800" alt="image" src="https://github.com/gpt-engineer-org/gpt-engineer/assets/4467025/1af9a457-1ed8-4ab3-90e6-4fa52bbb8c09"> |
4 | | - |
5 | | - |
6 | | -There are three main milestones we believe will greatly increase gpt-engineer's reliability and capability: |
7 | | -- [x] Continuous evaluation of our progress 🎉 |
8 | | -- [ ] Test code and fix errors with LLMs |
9 | | -- [ ] Make code generation become small, verifiable steps |
10 | | - |
11 | | -## Our current focus: |
12 | | - |
13 | | -- [x] **Continuous evaluation of progress 🎉** |
14 | | - - [x] Create a step that asks “did it run/work/perfect” in the end of each run [#240](https://github.com/gpt-engineer-org/gpt-engineer/issues/240) 🎉 |
15 | | - - [x] Collect a dataset for gpt engineer to learn from, by storing code generation runs 🎉 |
16 | | - - [ ] Run the benchmark multiple times, and document the results for the different "step configs" [#239](https://github.com/gpt-engineer-org/gpt-engineer/issues/239) |
17 | | - - [ ] Improve the default config based on results |
18 | | -- [ ] **Self healing code** |
19 | | - - [ ] Run the generated tests |
20 | | - - [ ] Feed the results of failing tests back into LLM and ask it to fix the code |
21 | | -- [ ] **Let human give feedback** |
22 | | - - [ ] Ask human for what is not working as expected in a loop, and feed it into LLM to fix the code, until the human is happy |
23 | | -- [ ] **Improve existing projects** |
24 | | - - [ ] Decide on the "flow" for the CLI commands and where the project files are created |
25 | | - - [ ] Add an "improve code" command |
26 | | - - [ ] Benchmark capabilities against other tools |
27 | | - |
28 | | -## Experimental research |
29 | | -This is not our current focus, but if you are interested in experimenting: Please |
30 | | -create a thread in Discord #general and share your intentions and your findings as you |
31 | | -go along. High impact examples: |
32 | | -- [ ] **Make code generation become small, verifiable steps** |
33 | | - - [ ] Ask GPT4 to decide how to sequence the entire generation, and do one |
34 | | - prompt for each subcomponent |
35 | | - - [ ] For each small part, generate tests for that subpart, and do the loop of running the tests for each part, feeding |
36 | | -results into GPT4, and let it edit the code until they pass |
37 | | -- [ ] **Ad hoc experiments** |
38 | | - - [ ] Try Microsoft guidance, and benchmark if this helps improve performance |
39 | | - - [ ] Dynamic planning: Let gpt-engineer plan which "steps" to carry out itself, depending on the |
40 | | -task, by giving it few shot example of what are usually "the right-sized steps" to carry |
41 | | -out for such projects |
42 | | - |
43 | | -## Codebase improvements |
44 | | -By improving the codebase and developer ergonomics, we accelerate progress. Some examples: |
45 | | -- [ ] Set up automatic PR review for all PRs with e.g. Codium pr-agent |
46 | | -- [ ] LLM tests in CI: Run super small tests with GPT3.5 in CI, that check that simple code generation still works |
| 3 | +<img width="800" alt="image" src="https://github.com/gpt-engineer-org/gpt-engineer/assets/48092564/06bce891-00ef-4052-bbbd-77af8b843fff"> |
| 4 | + |
| 5 | + |
| 6 | +This document is a general guide to the gpt-engineer project's strategic direction. |
| 7 | +Our goal is to continually improve by focusing on three main pillars: |
| 8 | +- User Experience, |
| 9 | +- Technical Features, and |
| 10 | +- Performance Tracking/Testing. |
| 11 | + |
| 12 | +Each pillar is supported by a set of epics, reflecting our major goals and initiatives. |
| 13 | + |
| 14 | +## Main Pillars and Epics |
| 15 | +### 👥 User Experience |
| 16 | + |
| 17 | +- **Easy installation** - We are committed to ensuring our solution is accessible, with straightforward installation and clear instructions, complemented by Docker image instructions. |
| 18 | + |
| 19 | +- **Optimized workflows** - Our focus is on optimizing human-in-the-loop workflows to enhance productivity and ease of use. |
| 20 | + |
| 21 | +- **Git integration** - We facilitate seamless version control with automatic Git integration. |
| 22 | + |
| 23 | +- **Up-to-date documentation** - Providing concise, automatically updated documentation, including a practical Cookbook to help users make the most of gpt-engineer. |
| 24 | + |
| 25 | + |
| 26 | +### ⚙️ Technical Features |
| 27 | +- **Self-healing code** - Development of advanced self-healing code features to increase resilience and improve the general quality of generated code. |
| 28 | + |
| 29 | +- **Test generation** - Implementing automatic test generation to streamline development and ensure code reliability. |
| 30 | + |
| 31 | +- **Algorithm integration** - Establishing a pipeline for integrating the latest algorithms from research, keeping us at the cutting edge. |
| 32 | + |
| 33 | +- **Startup partnerships** - We are interested in forging partnerships with startups to utilize and showcase our technical features, fostering a collaborative ecosystem. |
| 34 | + |
| 35 | + |
| 36 | +### 📊 Performance Tracking/Testing |
| 37 | +- **Test coverage in CI** - Achieving complete test coverage in Continuous Integration (CI), including compatibility with various Python versions and Docker. |
| 38 | +- **Benchmark suite** - Creating a comprehensive benchmark suite, including SWE-bench, APPS, and more, to measure and improve performance. |
| 39 | +- **Infrastructure for benchmarks** - Setting up dedicated infrastructure for periodic benchmark evaluation to ensure consistent performance. |
| 40 | +- **Benchmark scoreboard** - Introducing a public benchmark scoreboard that is open to other agents and includes time traces, for transparent and competitive performance tracking. |
| 41 | + |
| 42 | + |
| 43 | +## Tracking Progress with GitHub Projects |
| 44 | + |
| 45 | +We are using [GitHub Projects](https://github.com/orgs/gpt-engineer-org/projects/3) to track the progress of our roadmap. |
| 46 | + |
| 47 | +Each issue within our project is categorized under one of the main pillars and, in most cases, associated epics. |
| 48 | + |
| 49 | +The issues can be in various states: |
| 50 | + |
| 51 | +- **Backlog** - Ideas and tasks waiting to be prioritized. |
| 52 | +- **Todo** - Tasks that are prioritized and planned for the upcoming sprints. |
| 53 | +- **In Progress** - Work that is actively being developed or implemented. |
| 54 | +- **Done** - Completed tasks that have met our definition of done. |
| 55 | +- **Discarded** - Tasks or ideas that have been considered but are not proceeding. |
| 56 | + |
| 57 | + |
| 58 | + |
| 59 | +## Organizing Issues and Roadmap with Views |
| 60 | + |
| 61 | + |
| 62 | + |
| 63 | + |
| 64 | + |
| 65 | +The gpt-engineer project's roadmap is organized into several views, each offering a unique perspective on our work and progress. These views help the team and community quickly understand where things stand and what is being prioritized. |
| 66 | + |
| 67 | +### General View |
| 68 | + |
| 69 | +The **[General View](https://github.com/orgs/gpt-engineer-org/projects/3/views/1)** provides a comprehensive list of all issues, epics, and their current status. In this view, you can see a list of issues, their titles, pillar categorizations, epics, status, and assignees. It's a master list from which we can access any specific issue's details and track its journey from conception to completion. This high-level view is perfect for getting a snapshot of the entire project's progress at a glance. |
| 70 | + |
| 71 | +### Kanban View |
| 72 | + |
| 73 | +The **[Kanban View](https://github.com/orgs/gpt-engineer-org/projects/3/views/2)** visualizes our workflow in a traditional Kanban board format. It is divided into columns representing the state of issues, from "Backlog" to "Done," allowing us to track how work flows through our development process. We see it as a dynamic tool: tasks move through stages, and the visual representation helps us maintain the flow of tasks and promptly address any bottlenecks. |
| 74 | + |
| 75 | +### By Pillars |
| 76 | + |
| 77 | +The **[By Pillars](https://github.com/orgs/gpt-engineer-org/projects/3/views/3)** View categorizes issues under the main pillars of User Experience, Technical Features, and Performance Tracking/Testing. This view is crucial for maintaining focus on our foundational pillars. It filters issues into groups based on the pillar they contribute to, ensuring that our development efforts are evenly distributed and aligned with our strategic goals. |
| 78 | + |
| 79 | +### By Epics |
| 80 | + |
| 81 | +The **[By Epics](https://github.com/orgs/gpt-engineer-org/projects/3/views/4)** View organizes issues according to the epics they belong to. This detailed breakdown helps us focus on the larger goals we are working towards and manage related tasks effectively. Epics are our large-scale objectives that span multiple issues. The By Epics View allows us to filter and manage tasks that collectively contribute to a significant, overarching goal, clarifying the progression of these key initiatives. |
| 82 | + |
47 | 83 | |
48 | 84 | # How you can help out |
49 | 85 | |
50 | 86 | You can: |
51 | 87 | |
52 | | -- Post a "design" as a google doc in our Discord and ask for feedback to address one of the items in the roadmap |
| 88 | +- Post a "design" as a Google Doc in our [Discord](https://discord.com/channels/1119885301872070706/1120698764445880350), and ask for feedback to address one of the items in the roadmap |
53 | 89 | - Submit PRs to address one of the items in the roadmap |
54 | 90 | - Do a review of someone else's PR and propose next steps (further review, merge, close) |
55 | 91 | |
56 | | -Volunteer work in any of these will get acknowledged. |
| 92 | +🙌 Volunteer work in any of these will be acknowledged. 🙌 |