
Stealix: Model Stealing via Prompt Evolution

Summary

Model stealing poses a significant security risk in machine learning by enabling attackers to replicate a black-box model without access to its training data, thus jeopardizing intellectual property and exposing sensitive information. Recent methods that use pre-trained diffusion models for data synthesis improve efficiency and performance but rely heavily on manually crafted prompts, limiting automation and scalability, especially for attackers with little expertise. To assess the risks posed by open-source pre-trained models, we propose a more realistic threat model that eliminates the need for prompt design skills or knowledge of class names. In this context, we introduce Stealix, the first approach to perform model stealing without predefined prompts. Stealix uses two open-source pre-trained models to infer the victim model's data distribution, and iteratively refines prompts through a genetic algorithm based on a proxy metric, progressively improving the precision and diversity of synthetic images. Our experimental results demonstrate that Stealix significantly outperforms other methods, even those with access to class names or fine-grained prompts, while operating under the same query budget. These findings highlight the scalability of our approach and suggest that the risks posed by pre-trained generative models in model stealing may be greater than previously recognized.
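To illustrate the kind of prompt-evolution loop the abstract describes, the minimal Python sketch below evolves text prompts with a word-level genetic algorithm, scoring each prompt by a simple proxy metric (here assumed to be the fraction of synthesized images the black-box victim labels as the target class). This is not the paper's implementation: `synthesize`, `query_victim`, the vocabulary, and the fitness definition are hypothetical stand-ins chosen only to make the loop runnable.

```python
# Illustrative sketch (not the paper's code): evolving prompts with a genetic
# algorithm guided by a proxy metric. All components below are placeholders.
import random

VOCAB = ["photo", "close-up", "outdoor", "bright", "detailed", "animal", "object"]

def synthesize(prompt: str, n: int = 8) -> list:
    """Stand-in for an open-source text-to-image model (e.g., a diffusion model)."""
    return [f"{prompt}#{i}" for i in range(n)]  # placeholder "images"

def query_victim(image) -> int:
    """Stand-in for the black-box victim classifier; returns a class index."""
    return random.randint(0, 9)  # placeholder prediction

def proxy_fitness(prompt: str, target_class: int) -> float:
    """Assumed proxy metric: fraction of images the victim assigns to the target class."""
    images = synthesize(prompt)
    return sum(query_victim(img) == target_class for img in images) / len(images)

def mutate(prompt: str) -> str:
    """Randomly add, drop, or swap one word in the prompt."""
    words = prompt.split()
    op = random.choice(["add", "drop", "swap"])
    if op == "add" or len(words) < 2:
        words.insert(random.randrange(len(words) + 1), random.choice(VOCAB))
    elif op == "drop":
        words.pop(random.randrange(len(words)))
    else:
        words[random.randrange(len(words))] = random.choice(VOCAB)
    return " ".join(words)

def evolve(seed_prompt: str, target_class: int,
           pop_size: int = 8, generations: int = 5) -> str:
    """Keep the fittest prompts each generation and refill the population by mutation."""
    population = [seed_prompt] + [mutate(seed_prompt) for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: proxy_fitness(p, target_class), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [mutate(random.choice(parents)) for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda p: proxy_fitness(p, target_class))

print(evolve("a photo of something", target_class=3))
```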

Conference Paper

International Conference on Machine Learning (ICML)

Date published

2025-07

Date last modified

2025-05-23