Understanding the AI Code Generator “Agentless” by Running It Step by Step
# Overview
A while ago, Agentless was reported to achieve high performance on SWE-bench, so I decided to take a closer look at how it works. In this post, we will explore the fundamental features of Agentless and how to use it effectively, following the steps outlined in the article Agentless: A New Approach to Using LLMs.
Agentless operates through three main steps:
- Localization: Uses LLMs and embeddings to identify relevant files and positions (File-level → Element-level → Line-level).
- Repair: Generates multiple patches per issue, consisting of line positions and diff-based changes.
- Validation and Selection: Executes regression tests (existing tests) and generates reproduction tests for the issue, selecting the final patch based on these results.
# Steps
## 1. Preparation
### Repo Setup
First, clone the repository and install the necessary dependencies.
gh repo clone OpenAutoCoder/agentless
cd agentless
pip install -r requirements.txt
### Repo Structure
Download the repo structure from https://github.com/OpenAutoCoder/Agentless/releases/tag/v1.5.0 and extract it. If the repo structure is not downloaded, the structure generation script will run, which takes time.
### Environment Variable Configuration
Set the required environment variables:
export OPENAI_API_KEY=xxxxxx
export PROJECT_FILE_LOC=~/Downloads/repo_structure/repo_structures
export PYTHONPATH=$PYTHONPATH:$(pwd)
The path ~/Downloads/repo_structure/repo_structures should point to the folder extracted from the downloaded repo structure.
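Before running the later steps, it can help to confirm that these variables are visible to Python. The following is a minimal sketch and not part of Agentless; the helper name `missing_settings` is hypothetical.

```python
import os

# Variables exported in this walkthrough; the helper below is illustrative only.
REQUIRED_VARS = ("OPENAI_API_KEY", "PROJECT_FILE_LOC")


def missing_settings(env):
    """Return a list of human-readable problems found in the given environment mapping."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    loc = env.get("PROJECT_FILE_LOC")
    # Expand "~" ourselves in case the value was set without shell expansion.
    if loc and not os.path.isdir(os.path.expanduser(loc)):
        problems.append(f"PROJECT_FILE_LOC does not point to a directory: {loc}")
    return problems


if __name__ == "__main__":
    for problem in missing_settings(os.environ):
        print("WARNING:", problem)
```

If nothing is printed, both variables are set and the repo structure folder exists.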
## 2. Localization
In this post, we will use a specific target ID (e.g., django__django-10914) to retrieve relevant files.
The localization step identifies the locations that need to change to resolve the target issue.
#### Retrieve Relevant Files at File Level
Run localization for related files:
python agentless/fl/localize.py --file_level --output_folder results/swe-bench-lite/file_level --num_threads 10 --skip_existing --target_id=django__django-10914
#### Identify Irrelevant Files
Run localization for irrelevant files:
python agentless/fl/localize.py --file_level --output_folder results/swe-bench-lite/file_level_irrelevant --num_threads 10 --skip_existing --target_id=django__django-10914 --irrelevant
#### Retrieve Relevant Files from Embeddings
In addition to the LLM-based retrieval above, Agentless also retrieves relevant files using embeddings:
python agentless/fl/retrieve.py --index_type simple \
--filter_type given_files \
--filter_file results/swe-bench-lite/file_level_irrelevant/loc_outputs.jsonl \
--output_folder results/swe-bench-lite/retrievel_embedding \
--persist_dir embedding/swe-bench_simple \
--num_threads 10 \
--target_id=django__django-10914
#### Combine Results
Merge top N results from LLM and embeddings:
python agentless/fl/combine.py --retrieval_loc_file results/swe-bench-lite/retrievel_embedding/retrieve_locs.jsonl \
--model_loc_file results/swe-bench-lite/file_level/loc_outputs.jsonl \
--top_n 3 \
--output_folder results/swe-bench-lite/file_level_combined
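To make the idea of merging the top N results concrete, here is a minimal sketch. It assumes an order-preserving union of the two ranked lists; the actual logic in combine.py may differ, and both the function name `combine_top_n` and the file names are hypothetical.

```python
def combine_top_n(model_files, retrieved_files, top_n):
    """Merge the first top_n entries of each ranked list, dropping duplicates.

    Assumption: LLM results come first, then embedding results,
    and a file already seen is not added twice.
    """
    combined = []
    for path in model_files[:top_n] + retrieved_files[:top_n]:
        if path not in combined:
            combined.append(path)
    return combined


# Hypothetical ranked lists, used only for illustration.
llm_ranked = ["pkg/a.py", "pkg/b.py", "pkg/c.py", "pkg/d.py"]
embedding_ranked = ["pkg/b.py", "pkg/e.py", "pkg/f.py", "pkg/g.py"]

print(combine_top_n(llm_ranked, embedding_ranked, top_n=3))
```

With `top_n=3`, the fourth entry of each list is ignored, and `pkg/b.py` appears only once in the result.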
#### Get Element-Level Relevance
Using the combined file locations, retrieve the relevant elements (classes, functions, and variables):
python agentless/fl/localize.py --related_level \
--output_folder results/swe-bench-lite/related_elements \
--top_n 3 \
--compress_assign \
--compress \
--start_file results/swe-bench-lite/file_level_combined/combined_locs.jsonl \
--num_threads 10 \
--skip_existing \
--target_id=django__django-10914
The prompt used in this step is:
Please provide the complete set of locations as either a class name, a function name, or a variable name.
Note that if you include a class, you do not need to list its specific methods.
You can include either the entire class or don't include the class name and instead include specific methods in the class.
### Examples:
```
full_path1/file1.py
function: my_function_1
class: MyClass1
function: MyClass2.my_method
full_path2/file2.py
variable: my_var
function: MyClass3.my_method
full_path3/file3.py
function: my_function_2
function: my_function_3
function: MyClass4.my_method_1
class: MyClass5
```
Return just the locations wrapped with ```.
Example elements:
{"django/core/files/storage.py": ["class: FileSystemStorage"], "django/conf/global_settings.py": ["variable: FILE_UPLOAD_PERMISSIONS"], "django/core/files/uploadhandler.py": [""]}
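The element dict above can be flattened into (file, kind, name) rows for easier inspection. This is a minimal sketch; the helper name `split_elements` is hypothetical, and it assumes that a single list entry may contain several newline-separated `kind: name` lines.

```python
import json


def split_elements(found_related_locs):
    """Flatten {file: ["kind: name", ...]} into a list of (file, kind, name) tuples."""
    rows = []
    for path, entries in found_related_locs.items():
        for entry in entries:
            for line in entry.splitlines():
                line = line.strip()
                if not line:
                    continue  # empty entries mean nothing was selected in this file
                kind, _, name = line.partition(":")
                rows.append((path, kind.strip(), name.strip()))
    return rows


example = json.loads(
    '{"django/core/files/storage.py": ["class: FileSystemStorage"], '
    '"django/conf/global_settings.py": ["variable: FILE_UPLOAD_PERMISSIONS"], '
    '"django/core/files/uploadhandler.py": [""]}'
)

for row in split_elements(example):
    print(row)
```

Files with an empty entry, such as django/core/files/uploadhandler.py here, produce no rows.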
#### Generate Line-Level Change Samples
Identify the line numbers for the relevant elements extracted in the previous step. Use --num_samples to set how many samples are generated for comparison in the subsequent steps:
python agentless/fl/localize.py --fine_grain_line_level \
--output_folder results/swe-bench-lite/edit_location_samples \
--top_n 3 \
--compress \
--temperature 0.8 \
--num_samples 4 \
--start_file results/swe-bench-lite/related_elements/loc_outputs.jsonl \
--num_threads 10 \
--skip_existing \
--target_id=django__django-10914
Part of the prompt:
Please provide the class name, function or method name, or the exact line numbers that need to be edited.
The possible location outputs should be either "class", "function" or "line".
### Examples:
```
full_path1/file1.py
line: 10
class: MyClass1
line: 51
full_path2/file2.py
function: MyClass2.my_method
line: 12
full_path3/file3.py
function: my_function
line: 24
line: 156
```
Return just the location(s) wrapped with ```.
Example result:
["```\ndjango/core/files/storage.py\nline: 260\nline: 217\n\ndjango/conf/global_settings.py\nline: 307\n```", "```\nfull_path1/django/core/files/storage.py\nline: 260\nline: 284\n\nfull_path2/django/conf/global_settings.py\nline: 307\n```", "```\ndjango/core/files/storage.py\nline: 260\n\ndjango/conf/global_settings.py\nline: 307\n```", "```\ndjango/conf/global_settings.py\nline: 307\n\ndjango/core/files/storage.py\nline: 260\n```"]
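Each raw sample is a fenced block listing file paths followed by `line:` entries. A minimal sketch of turning one sample into a dict of line numbers could look like the following; the function name `parse_edit_locations` is hypothetical, and this is not the parser Agentless itself uses.

```python
def parse_edit_locations(raw):
    """Parse one raw sample into {file_path: [line numbers]}."""
    locations = {}
    current = None
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("```"):
            continue  # skip blank lines and the fence markers
        if line.startswith("line:"):
            if current is not None:
                locations[current].append(int(line.split(":", 1)[1]))
        elif line.startswith(("class:", "function:")):
            continue  # this sketch collects line numbers only
        else:
            # Anything else is treated as a file path that starts a new group.
            current = line
            locations.setdefault(current, [])
    return locations


# First sample from the example result above.
sample = "```\ndjango/core/files/storage.py\nline: 260\nline: 217\n\ndjango/conf/global_settings.py\nline: 307\n```"
print(parse_edit_locations(sample))
```

Note that one of the samples above prefixes the paths with `full_path1/` and `full_path2/`, copied from the prompt examples, so real post-processing would also need to normalize paths.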
The loc_outputs.jsonl format:
- instance_id: task ID of the issue
- found_files: list of files localized by the model
- additional_artifact_loc_file: raw output of the model during file-level localization
- file_traj: trajectory of the model during file-level localization (e.g., # of tokens)
- found_related_locs: dict of relevant code elements localized by the model
- additional_artifact_loc_related: raw output of the model during relevant-code-level localization
- related_loc_traj: trajectory of the model during relevant-code-level localization
- found_edit_locs: dict of edit locations localized by the model
- additional_artifact_loc_edit_location: raw output of the model during edit-location-level localization
- edit_loc_traj: trajectory of the model during edit-location-level localization
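Since every line of the file is one JSON record with the fields listed above, it is easy to inspect from Python. The sketch below is illustrative; the helper name `summarize_locations` is hypothetical, and the demo record is a made-up minimal record rather than real output.

```python
import io
import json


def summarize_locations(lines):
    """Map each instance_id to its localized files and whether edit locations exist."""
    summary = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        summary[record["instance_id"]] = {
            "files": record.get("found_files", []),
            "has_edit_locs": bool(record.get("found_edit_locs")),
        }
    return summary


# Made-up minimal record standing in for a real loc_outputs.jsonl line.
demo = io.StringIO(
    json.dumps({"instance_id": "django__django-10914",
                "found_files": ["django/core/files/storage.py"]}) + "\n"
)
print(summarize_locations(demo))
```

To use it on real output, pass an open file handle, for example one opened on results/swe-bench-lite/edit_location_samples/loc_outputs.jsonl.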
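#### Merge Edit Locations
Next, run localization with --merge to split the sampled edit locations into separate sets. With --num_samples 4, this produces four location files, one per sample: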
python agentless/fl/localize.py --merge \
--output_folder results/swe-bench-lite/edit_location_individual \
--top_n 3 \
--num_samples 4 \
--start_file results/swe-bench-lite/edit_location_samples/loc_outputs.jsonl \
--target_id=django__django-10914
Resulting directory:
tree results/swe-bench-lite/edit_location_individual
results/swe-bench-lite/edit_location_individual
├── args.json
├── loc_merged_0-0_outputs.jsonl
├── loc_merged_1-1_outputs.jsonl
├── loc_merged_2-2_outputs.jsonl
├── loc_merged_3-3_outputs.jsonl
└── localization_logs
2 directories, 5 files
## 3. Repair
#### Generate Patches
Generate multiple patches per issue and perform voting to determine the final patch (example: results/swe-bench-lite/repair_sample_1):
python agentless/repair/repair.py --loc_file results/swe-bench-lite/edit_location_individual/loc_merged_0-0_outputs.jsonl \
--output_folder results/swe-bench-lite/repair_sample_1 \
--loc_interval \
--top_n=3 \
--context_window=10 \
--max_samples 10 \
--cot \
--diff_format \
--gen_and_process \
--num_threads 2 \
--target_id=django__django-10914
Run for the remaining three samples:
for i in {1..3}; do
python agentless/repair/repair.py --loc_file results/swe-bench-lite/edit_location_individual/loc_merged_${i}-${i}_outputs.jsonl \
--output_folder results/swe-bench-lite/repair_sample_$((i+1)) \
--loc_interval \
--top_n=3 \
--context_window=10 \
--max_samples 10 \
--cot \
--diff_format \
--gen_and_process \
--num_threads 2 \
--target_id=django__django-10914
done
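The voting mentioned above can be pictured as majority voting over normalized patches. The following is a minimal sketch under that assumption; the normalization here (stripping trailing whitespace and blank lines) is a simplification and not the normalization Agentless performs, and the function names are hypothetical.

```python
from collections import Counter


def normalize(patch):
    """Drop trailing whitespace and blank lines so trivially different patches compare equal."""
    return "\n".join(line.rstrip() for line in patch.splitlines() if line.strip())


def majority_vote(patches):
    """Return the first-seen patch whose normalized form occurs most often, or None."""
    counts = Counter(normalize(p) for p in patches if p.strip())
    if not counts:
        return None
    best, _ = counts.most_common(1)[0]
    for patch in patches:
        if normalize(patch) == best:
            return patch
    return None


# Hypothetical patch texts, used only for illustration.
samples = ["+x = 1\n", "+x = 2\n", "+x = 1   \n\n", "", "+x = 1\n"]
print(repr(majority_vote(samples)))
```

Empty patches are ignored, so a sample that failed to produce a diff does not count as a vote.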
## 4. Patch Validation and Selection
The last step is to evaluate the patches and select the final patch. The evaluation includes regression tests (running existing tests) and reproduction tests (new tests for the target issue).
### Regression Test Selection
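First, run the existing tests on the original repository to find the ones that already pass:
python agentless/test/run_regression_tests.py --run_id generate_regression_tests \
--output_file results/swe-bench-lite/passing_tests.jsonl \
--instance_ids=django__django-10914
Next, select which of the passing tests to use as regression tests:
python agentless/test/select_regression_tests.py --passing_tests results/swe-bench-lite/passing_tests.jsonl \
--output_folder results/swe-bench-lite/select_regression
Then run the selected regression tests against each generated patch (shown here for repair_sample_1):
folder=results/swe-bench-lite/repair_sample_1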
for num in {0..9..1}; do
run_id_prefix=$(basename $folder);
python agentless/test/run_regression_tests.py --regression_tests results/swe-bench-lite/select_regression/output.jsonl \
--predictions_path="${folder}/output_${num}_processed.jsonl" \
--run_id="${run_id_prefix}_regression_${num}" --num_workers 10;
done
### Generate Reproduction Tests
Generate reproduction tests for the target issue:
python agentless/test/generate_reproduction_tests.py --max_samples 40 \
--output_folder results/swe-bench-lite/reproduction_test_samples \
--num_threads 10
Run the tests against samples:
for st in {0..36..4}; do en=$((st + 3));
echo "Processing ${st} to ${en}";
for num in $(seq $st $en); do
echo "Processing ${num}";
python agentless/test/run_reproduction_tests.py --run_id="reproduction_test_generation_filter_sample_${num}" \
--test_jsonl="results/swe-bench-lite/reproduction_test_samples/output_${num}_processed_reproduction_test.jsonl" \
--num_workers 6 \
--testing;
done & done
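The nested shell loop above splits the 40 generated samples into 10 batches of 4 and starts each batch as a background job. The batching arithmetic can be sketched in Python as follows; the helper name `batches` is hypothetical and the sketch only prints the ranges without running any tests.

```python
def batches(total, size):
    """Split range(total) into consecutive ranges of at most `size` items."""
    return [range(start, min(start + size, total)) for start in range(0, total, size)]


# Mirrors `for st in {0..36..4}` with `en=$((st + 3))` in the shell loop.
for batch in batches(40, 4):
    print(f"Processing {batch.start} to {batch.stop - 1}")
```

Each printed range corresponds to one backgrounded inner loop, so up to 10 batches run concurrently, each with its own --num_workers setting.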
Select one reproduction test per issue by majority voting, which writes the reproduction_tests.jsonl file used in the next command:
python agentless/test/generate_reproduction_tests.py --max_samples 40 \
--output_folder results/swe-bench-lite/reproduction_test_samples \
--output_file reproduction_tests.jsonl \
--select
Then run the selected reproduction test against each generated patch:
folder=results/swe-bench-lite/repair_sample_1
for num in {0..9..1}; do
run_id_prefix=$(basename $folder);
python agentless/test/run_reproduction_tests.py --test_jsonl results/swe-bench-lite/reproduction_test_samples/reproduction_tests.jsonl \
--predictions_path="${folder}/output_${num}_processed.jsonl" \
--run_id="${run_id_prefix}_reproduction_${num}" --num_workers 10;
done
### Reranking and Patch Selection
Lastly, rerank the patches to make the final selection:
python agentless/repair/rerank.py --patch_folder results/swe-bench-lite/repair_sample_1/,results/swe-bench-lite/repair_sample_2/,results/swe-bench-lite/repair_sample_3/,results/swe-bench-lite/repair_sample_4/ \
--num_samples 40 \
--deduplicate \
--regression \
--reproduction
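One way to picture how test results can drive the final choice is a simple sort key. The sketch below is an assumption for illustration only, not the logic of rerank.py: it prefers patches with fewer regression failures, then patches that pass the reproduction test, then patches with more votes. The `Candidate` class and `pick_patch` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    patch: str                  # patch text
    regression_failures: int    # number of selected regression tests that fail
    passes_reproduction: bool   # whether the reproduction test passes
    votes: int                  # how many samples produced this (normalized) patch


def pick_patch(candidates):
    """Pick the best candidate under the assumed ordering, or None if there are none."""
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda c: (c.regression_failures, not c.passes_reproduction, -c.votes),
    )


# Hypothetical candidates, used only for illustration.
pool = [
    Candidate("patch A", regression_failures=0, passes_reproduction=False, votes=5),
    Candidate("patch B", regression_failures=0, passes_reproduction=True, votes=2),
    Candidate("patch C", regression_failures=1, passes_reproduction=True, votes=9),
]
print(pick_patch(pool).patch)
```

In this example, patch C has the most votes but breaks a regression test, and patch A does not fix the reproduced issue, so patch B is selected.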
# Summary
Running Agentless step by step gave us a clear picture of the flow of AI-driven code generation. Its three steps (Localization, Repair, and Validation and Selection) form a structured way to produce and select code modifications.
As next steps, we plan to explore how to apply this process to repositories outside SWE-bench. We will also revisit the regression test setup to prepare for future runs. I encourage everyone to try out Agentless!