Discrete Prompt Optimization Using Genetic Algorithm for Secure Python Code Generation
Pintor, Maura
2026-01-01
Abstract
Large language models (LLMs) have become powerful tools that enable novice developers to generate production-level code. However, research has highlighted the security risks associated with such code generation, owing to the high volume of software vulnerabilities it produces. Recent studies have explored various techniques for automatically optimizing prompts to elicit desired responses from LLMs. Among these methods, Genetic Algorithms (GAs), which search for optimal solutions by evolving an initial population of candidates through iterative mutations, have gained attention as a lightweight and effective prompt optimization approach that does not require large datasets or access to model weights. However, their potential has not yet been examined in the context of secure code generation. In this paper, we use a GA to develop a discrete prompt optimization pipeline specifically designed for secure code generation. We introduce two domain-specific prompt mutation techniques and assess how incorporating these security-focused mutations alongside general-purpose techniques, such as back translation and paraphrasing, affects the security of Python code generated by LLMs. Results demonstrate that our security-specific mutation techniques led to prompts with richer security context than the generic mutation techniques. Furthermore, combining these techniques with generic mutations substantially reduced the number of security weaknesses in the LLM-generated code. We also observed that prompts optimized for a particular LLM tend to perform best on that same model, highlighting the importance of model-specific prompt optimization.
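The GA loop the abstract describes can be sketched in a few lines. The following is a minimal, self-contained illustration only, not the paper's implementation: the mutation operators and the fitness function (`count_weaknesses`, here a keyword-based stub standing in for an LLM-plus-static-analyzer pipeline) are hypothetical names introduced for this example.

```python
import random

def mutate_paraphrase(prompt: str) -> str:
    # Stand-in for a generic mutation such as paraphrasing or back translation.
    return prompt.replace("Write", "Implement")

def mutate_security_context(prompt: str) -> str:
    # Stand-in for a security-specific mutation: append a security hint.
    hints = [
        "Validate all user input.",
        "Avoid calling eval() on untrusted data.",
        "Use parameterized queries to prevent SQL injection.",
    ]
    return prompt + " " + random.choice(hints)

def count_weaknesses(prompt: str) -> int:
    # Stub fitness function: in a real pipeline this would count static-analysis
    # warnings in code an LLM generates from the prompt. Here, prompts carrying
    # more security context simply score lower (better).
    keywords = ("Validate", "eval", "parameterized")
    return max(0, 3 - sum(kw in prompt for kw in keywords))

def optimize(seed: str, generations: int = 10, pop_size: int = 6) -> str:
    # Evolve a population of candidate prompts via iterative mutation
    # and truncation selection on the fitness score.
    population = [seed] * pop_size
    mutations = [mutate_paraphrase, mutate_security_context]
    for _ in range(generations):
        offspring = [random.choice(mutations)(p) for p in population]
        population = sorted(population + offspring, key=count_weaknesses)[:pop_size]
    return population[0]

best = optimize("Write a Python function that stores user records in a database.")
```

Because selection keeps the lowest-scoring prompts, the returned prompt is never worse than the seed under this stub fitness; swapping in a real analyzer-backed fitness function recovers the structure of the pipeline described above.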
| File | Size | Format | |
|---|---|---|---|
| 1-s2.0-S0164121225003516-main_2.pdf (open access; publisher's version, VoR) | 4.36 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.


