
Stealix: Model Stealing via Prompt Evolution

Abstract

Model stealing poses a significant security risk in machine learning by enabling attackers to replicate a black-box model without access to its training data, thus jeopardizing intellectual property and exposing sensitive information. Recent methods that use pre-trained diffusion models for data synthesis improve efficiency and performance but rely heavily on manually crafted prompts, limiting automation and scalability, especially for attackers with little expertise. To assess the risks posed by open-source pre-trained models, we propose a more realistic threat model that eliminates the need for prompt design skills or knowledge of class names. In this context, we introduce Stealix, the first approach to perform model stealing without predefined prompts. Stealix uses two open-source pre-trained models to infer the victim model's data distribution, and iteratively refines prompts through a genetic algorithm based on a proxy metric, progressively improving the precision and diversity of synthetic images. Our experimental results demonstrate that Stealix significantly outperforms other methods, even those with access to class names or fine-grained prompts, while operating under the same query budget. These findings highlight the scalability of our approach and suggest that the risks posed by pre-trained generative models in model stealing may be greater than previously recognized.
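The sketch below is a minimal, hypothetical illustration of the kind of genetic-algorithm prompt refinement loop the abstract describes, not the paper's actual implementation. The callables generate_images, query_victim, and proxy_fitness are assumed placeholders for the text-to-image generator, the black-box victim model, and the proxy metric, and the mutation/crossover operators are toy examples.

import random

def mutate(prompt, vocabulary):
    # Toy mutation operator (an assumption): swap one random token for a vocabulary word.
    tokens = prompt.split()
    tokens[random.randrange(len(tokens))] = random.choice(vocabulary)
    return " ".join(tokens)

def crossover(a, b):
    # Toy crossover operator (an assumption): splice two parent prompts at a random cut point.
    ta, tb = a.split(), b.split()
    if min(len(ta), len(tb)) < 2:
        return a
    cut = random.randrange(1, min(len(ta), len(tb)))
    return " ".join(ta[:cut] + tb[cut:])

def evolve_prompts(seed_prompts, vocabulary, generate_images, query_victim,
                   proxy_fitness, generations=10, population_size=8):
    # Refine prompts with a genetic algorithm guided by a proxy metric.
    # Expects at least two non-empty seed prompts.
    population = list(seed_prompts)
    for _ in range(generations):
        scored = []
        for prompt in population:
            images = generate_images(prompt)           # synthesize candidate images
            labels = query_victim(images)              # black-box victim predictions
            scored.append((proxy_fitness(images, labels), prompt))
        scored.sort(key=lambda s: s[0], reverse=True)  # keep the fittest prompts as parents
        parents = [p for _, p in scored[: max(2, population_size // 2)]]
        children = [mutate(crossover(*random.sample(parents, 2)), vocabulary)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return population

In this reading, proxy_fitness would be where the paper's proxy metric plugs in, scoring how precisely a prompt's synthetic images match the victim's predicted class while rewarding diversity, so that successive generations yield better surrogate training data.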

Conference Paper

International Conference on Machine Learning (ICML)

Publication Date

2025-07

Last Modified

2025-05-23