-
Notifications
You must be signed in to change notification settings - Fork 26
refactor: extract PythonFunctionOptimizer subclass #1699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 24 commits
Commits
Show all changes
25 commits
Select commit
Hold shift + click to select a range
1ad876f
refactor: extract Python-specific optimizer logic to languages/python…
KRRT7 c7f225d
refactor: use resolve_python_function_ast in FunctionOptimizer.__init__
KRRT7 c5cdefe
refactor: extract PythonFunctionOptimizer subclass from FunctionOptim…
KRRT7 64de536
fix: restore base replace_function_and_helpers, fix test imports, mov…
KRRT7 717c10d
fix: resolve mypy type error in resolve_python_function_ast
github-actions[bot] 524d1be
refactor: extract instrument_existing_tests into PythonFunctionOptimizer
KRRT7 f606250
refactor: extract instrument_async_for_mode and instrument_capture hooks
KRRT7 f96e518
refactor: remove redundant instrument_existing_tests override
KRRT7 d93506c
fix: resolve mypy type errors in new Python optimizer files
github-actions[bot] 3906743
refactor: extract should_check_coverage hook into PythonFunctionOptim…
KRRT7 f5715e7
refactor: extract collect_async_metrics hook into PythonFunctionOptim…
KRRT7 74d2efb
refactor: remove is_python() guards from code-restore sites in run_op…
KRRT7 2457da1
refactor: extract compare_candidate_results hook into PythonFunctionO…
KRRT7 c8b8074
refactor: extract should_skip_sqlite_cleanup hook into PythonFunction…
KRRT7 a2e3a0a
refactor: extract line_profiler_step and parse_line_profile_test_resu…
KRRT7 a3d81ba
refactor: remove redundant AST pre-resolution from create_function_op…
KRRT7 96c1459
chore: add TODO for get_code_optimization_context extraction, update …
KRRT7 a55841b
fix: use PythonFunctionOptimizer in tests that depend on Python-speci…
KRRT7 2b40e4b
Update codeflash/optimization/optimizer.py
KRRT7 6a916ac
fix: address review feedback on PythonFunctionOptimizer extraction
KRRT7 1f914c5
docs: update architecture.md with new Python optimizer files
github-actions[bot] 6f2939a
fix: restore 'Discovered' log message for E2E test compatibility
github-actions[bot] b6af185
fix: split discovery and instrumentation log messages for E2E harnesses
KRRT7 192cf0f
style: fix unused loop variable in function_optimizer.py
github-actions[bot] 4ccbffe
fix: combine conditions in line_profiler_step to avoid returning empt…
github-actions[bot] File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,190 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import ast | ||
| from pathlib import Path | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| from codeflash.cli_cmds.console import console, logger | ||
| from codeflash.code_utils.config_consts import TOTAL_LOOPING_TIME_EFFECTIVE | ||
| from codeflash.languages.python.context.unused_definition_remover import ( | ||
| detect_unused_helper_functions, | ||
| revert_unused_helper_functions, | ||
| ) | ||
| from codeflash.languages.python.optimizer import resolve_python_function_ast | ||
| from codeflash.languages.python.static_analysis.code_extractor import get_opt_review_metrics, is_numerical_code | ||
| from codeflash.languages.python.static_analysis.code_replacer import ( | ||
| add_custom_marker_to_all_tests, | ||
| modify_autouse_fixture, | ||
| ) | ||
| from codeflash.languages.python.static_analysis.line_profile_utils import add_decorator_imports, contains_jit_decorator | ||
| from codeflash.models.models import TestingMode, TestResults | ||
| from codeflash.optimization.function_optimizer import FunctionOptimizer | ||
| from codeflash.verification.parse_test_output import calculate_function_throughput_from_test_results | ||
|
|
||
| if TYPE_CHECKING: | ||
| from typing import Any | ||
|
|
||
| from codeflash.languages.base import Language | ||
| from codeflash.models.function_types import FunctionParent | ||
| from codeflash.models.models import ( | ||
| CodeOptimizationContext, | ||
| CodeStringsMarkdown, | ||
| ConcurrencyMetrics, | ||
| CoverageData, | ||
| OriginalCodeBaseline, | ||
| TestDiff, | ||
| ) | ||
|
|
||
|
|
||
| class PythonFunctionOptimizer(FunctionOptimizer): | ||
| def _resolve_function_ast( | ||
| self, source_code: str, function_name: str, parents: list[FunctionParent] | ||
| ) -> ast.FunctionDef | ast.AsyncFunctionDef | None: | ||
| original_module_ast = ast.parse(source_code) | ||
| return resolve_python_function_ast(function_name, parents, original_module_ast) | ||
|
|
||
| def analyze_code_characteristics(self, code_context: CodeOptimizationContext) -> None: | ||
| self.is_numerical_code = is_numerical_code(code_string=code_context.read_writable_code.flat) | ||
|
|
||
| def get_optimization_review_metrics( | ||
| self, | ||
| source_code: str, | ||
| file_path: Path, | ||
| qualified_name: str, | ||
| project_root: Path, | ||
| tests_root: Path, | ||
| language: Language, | ||
| ) -> str: | ||
| return get_opt_review_metrics(source_code, file_path, qualified_name, project_root, tests_root, language) | ||
|
|
||
| def instrument_test_fixtures(self, test_paths: list[Path]) -> dict[Path, list[str]] | None: | ||
| logger.info("Disabling all autouse fixtures associated with the generated test files") | ||
| original_conftest_content = modify_autouse_fixture(test_paths) | ||
| logger.info("Add custom marker to generated test files") | ||
| add_custom_marker_to_all_tests(test_paths) | ||
| return original_conftest_content | ||
|
|
||
| def instrument_capture(self, file_path_to_helper_classes: dict[Path, set[str]]) -> None: | ||
| from codeflash.verification.instrument_codeflash_capture import instrument_codeflash_capture | ||
|
|
||
| instrument_codeflash_capture(self.function_to_optimize, file_path_to_helper_classes, self.test_cfg.tests_root) | ||
|
|
||
| def should_check_coverage(self) -> bool: | ||
| return True | ||
|
|
||
| def collect_async_metrics( | ||
| self, | ||
| benchmarking_results: TestResults, | ||
| code_context: CodeOptimizationContext, | ||
| helper_code: dict[Path, str], | ||
| test_env: dict[str, str], | ||
| ) -> tuple[int | None, ConcurrencyMetrics | None]: | ||
| if not self.function_to_optimize.is_async: | ||
| return None, None | ||
|
|
||
| async_throughput = calculate_function_throughput_from_test_results( | ||
| benchmarking_results, self.function_to_optimize.function_name | ||
| ) | ||
| logger.debug(f"Async function throughput: {async_throughput} calls/second") | ||
|
|
||
| concurrency_metrics = self.run_concurrency_benchmark( | ||
| code_context=code_context, original_helper_code=helper_code, test_env=test_env | ||
| ) | ||
| if concurrency_metrics: | ||
| logger.debug( | ||
| f"Concurrency metrics: ratio={concurrency_metrics.concurrency_ratio:.2f}, " | ||
| f"seq={concurrency_metrics.sequential_time_ns}ns, conc={concurrency_metrics.concurrent_time_ns}ns" | ||
| ) | ||
| return async_throughput, concurrency_metrics | ||
|
|
||
| def instrument_async_for_mode(self, mode: TestingMode) -> None: | ||
| from codeflash.code_utils.instrument_existing_tests import add_async_decorator_to_function | ||
|
|
||
| add_async_decorator_to_function( | ||
| self.function_to_optimize.file_path, self.function_to_optimize, mode, project_root=self.project_root | ||
| ) | ||
|
|
||
| def should_skip_sqlite_cleanup(self, testing_type: TestingMode, optimization_iteration: int) -> bool: | ||
| return False | ||
|
|
||
| def parse_line_profile_test_results( | ||
| self, line_profiler_output_file: Path | None | ||
| ) -> tuple[TestResults | dict, CoverageData | None]: | ||
| from codeflash.verification.parse_line_profile_test_output import parse_line_profile_results | ||
|
|
||
| return parse_line_profile_results(line_profiler_output_file=line_profiler_output_file) | ||
|
|
||
| def compare_candidate_results( | ||
| self, | ||
| baseline_results: OriginalCodeBaseline, | ||
| candidate_behavior_results: TestResults, | ||
| optimization_candidate_index: int, | ||
| ) -> tuple[bool, list[TestDiff]]: | ||
| from codeflash.verification.equivalence import compare_test_results | ||
|
|
||
| return compare_test_results(baseline_results.behavior_test_results, candidate_behavior_results) | ||
|
|
||
| def replace_function_and_helpers_with_optimized_code( | ||
| self, | ||
| code_context: CodeOptimizationContext, | ||
| optimized_code: CodeStringsMarkdown, | ||
| original_helper_code: dict[Path, str], | ||
| ) -> bool: | ||
| did_update = super().replace_function_and_helpers_with_optimized_code( | ||
| code_context, optimized_code, original_helper_code | ||
| ) | ||
| unused_helpers = detect_unused_helper_functions(self.function_to_optimize, code_context, optimized_code) | ||
| if unused_helpers: | ||
| revert_unused_helper_functions(self.project_root, unused_helpers, original_helper_code) | ||
| return did_update | ||
|
|
||
| def line_profiler_step( | ||
| self, code_context: CodeOptimizationContext, original_helper_code: dict[Path, str], candidate_index: int | ||
| ) -> dict[str, Any]: | ||
| candidate_fto_code = Path(self.function_to_optimize.file_path).read_text("utf-8") | ||
| if contains_jit_decorator(candidate_fto_code): | ||
| logger.info( | ||
| f"Skipping line profiler for {self.function_to_optimize.function_name} - code contains JIT decorator" | ||
| ) | ||
| return {"timings": {}, "unit": 0, "str_out": ""} | ||
|
|
||
| for module_abspath in original_helper_code: | ||
| candidate_helper_code = Path(module_abspath).read_text("utf-8") | ||
| if contains_jit_decorator(candidate_helper_code): | ||
| logger.info( | ||
| f"Skipping line profiler for {self.function_to_optimize.function_name} - helper code contains JIT decorator" | ||
| ) | ||
| return {"timings": {}, "unit": 0, "str_out": ""} | ||
|
|
||
| try: | ||
| console.rule() | ||
|
|
||
| test_env = self.get_test_env( | ||
| codeflash_loop_index=0, codeflash_test_iteration=candidate_index, codeflash_tracer_disable=1 | ||
| ) | ||
| line_profiler_output_file = add_decorator_imports(self.function_to_optimize, code_context) | ||
| line_profile_results, _ = self.run_and_parse_tests( | ||
| testing_type=TestingMode.LINE_PROFILE, | ||
| test_env=test_env, | ||
| test_files=self.test_files, | ||
| optimization_iteration=0, | ||
| testing_time=TOTAL_LOOPING_TIME_EFFECTIVE, | ||
| enable_coverage=False, | ||
| code_context=code_context, | ||
| line_profiler_output_file=line_profiler_output_file, | ||
| ) | ||
| finally: | ||
| self.write_code_and_helpers( | ||
| self.function_to_optimize_source_code, original_helper_code, self.function_to_optimize.file_path | ||
| ) | ||
| if isinstance(line_profile_results, TestResults): | ||
| if not line_profile_results.test_results: | ||
| logger.warning( | ||
| f"Timeout occurred while running line profiler for original function {self.function_to_optimize.function_name}" | ||
| ) | ||
| return {"timings": {}, "unit": 0, "str_out": ""} | ||
| if line_profile_results["str_out"] == "": | ||
| logger.warning( | ||
| f"Couldn't run line profiler for original function {self.function_to_optimize.function_name}" | ||
| ) | ||
| return line_profile_results | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import ast | ||
| from typing import TYPE_CHECKING | ||
|
|
||
| from codeflash.cli_cmds.console import logger | ||
| from codeflash.models.models import ValidCode | ||
|
|
||
| if TYPE_CHECKING: | ||
| from pathlib import Path | ||
|
|
||
| from codeflash.models.function_types import FunctionParent | ||
|
|
||
|
|
||
| def prepare_python_module( | ||
| original_module_code: str, original_module_path: Path, project_root: Path | ||
| ) -> tuple[dict[Path, ValidCode], ast.Module] | None: | ||
| """Parse a Python module, normalize its code, and validate imported callee modules. | ||
|
|
||
| Returns a mapping of file paths to ValidCode (for the module and its imported callees) | ||
| plus the parsed AST, or None on syntax error. | ||
| """ | ||
| from codeflash.languages.python.static_analysis.code_replacer import normalize_code, normalize_node | ||
| from codeflash.languages.python.static_analysis.static_analysis import analyze_imported_modules | ||
|
|
||
| try: | ||
| original_module_ast = ast.parse(original_module_code) | ||
| except SyntaxError as e: | ||
| logger.warning(f"Syntax error parsing code in {original_module_path}: {e}") | ||
| logger.info("Skipping optimization due to file error.") | ||
| return None | ||
|
|
||
| normalized_original_module_code = ast.unparse(normalize_node(original_module_ast)) | ||
| validated_original_code: dict[Path, ValidCode] = { | ||
| original_module_path: ValidCode( | ||
| source_code=original_module_code, normalized_code=normalized_original_module_code | ||
| ) | ||
| } | ||
|
|
||
| imported_module_analyses = analyze_imported_modules(original_module_code, original_module_path, project_root) | ||
|
|
||
| for analysis in imported_module_analyses: | ||
| callee_original_code = analysis.file_path.read_text(encoding="utf8") | ||
| try: | ||
| normalized_callee_original_code = normalize_code(callee_original_code) | ||
| except SyntaxError as e: | ||
| logger.warning(f"Syntax error parsing code in callee module {analysis.file_path}: {e}") | ||
| logger.info("Skipping optimization due to helper file error.") | ||
| return None | ||
| validated_original_code[analysis.file_path] = ValidCode( | ||
| source_code=callee_original_code, normalized_code=normalized_callee_original_code | ||
| ) | ||
|
|
||
| return validated_original_code, original_module_ast | ||
|
|
||
|
|
||
| def resolve_python_function_ast( | ||
| function_name: str, parents: list[FunctionParent], module_ast: ast.Module | ||
| ) -> ast.FunctionDef | ast.AsyncFunctionDef | None: | ||
| """Look up a function/method AST node in a parsed Python module.""" | ||
| from codeflash.languages.python.static_analysis.static_analysis import get_first_top_level_function_or_method_ast | ||
|
|
||
| return get_first_top_level_function_or_method_ast(function_name, parents, module_ast) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.