{"id":2870,"date":"2026-02-16T11:25:25","date_gmt":"2026-02-16T16:25:25","guid":{"rendered":"https:\/\/ozer.gt\/log\/?p=2870"},"modified":"2026-02-16T11:35:40","modified_gmt":"2026-02-16T16:35:40","slug":"new-data-duets-post-using-generative-models-well-to-generate-data","status":"publish","type":"post","link":"https:\/\/ozer.gt\/log\/2026\/02\/16\/new-data-duets-post-using-generative-models-well-to-generate-data\/","title":{"rendered":"New Data Duets post: Using generative models, well, to generate data"},"content":{"rendered":"<p>I recently shared an underappreciated use case for generative models in data science: creating high-fidelity tabular datasets (OTA data for regression discontinuity).<\/p>\n<p>The model\u2019s success in data synthesis motivated a question: what are some high-value use cases for data science teams when using generative models to create datasets? This, in turn, led to our latest Data Duets post: \u201cUsing generative models, well, to generate data.\u201d<\/p>\n<p>I walk through using the Synthetic Data Vault to scale a small OTA sample while preserving its statistical properties and the causal discontinuity. <a href=\"https:\/\/www.linkedin.com\/in\/duygudagli\">Duygu Dagli<\/a> then weighs in on business implications: creating statistical twins to share data with vendors for solution optimization and benchmarking, simulating product recall data, and solving cold start problems in retail.<\/p>\n<p>Ultimately, this approach is a step toward data centricity: using high-fidelity simulations to dissect and validate the assumptions that drive our models.<\/p>\n<p><a href=\"https:\/\/www.dataduets.com\/2026\/02\/using-generative-models-well-to-generate-data.html\">Link to the full post<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently shared an underappreciated use case for generative models in data science: creating high-fidelity tabular datasets (OTA data for regression discontinuity). 
The model\u2019s success in data synthesis motivated a question: what are some high-value use cases for data science teams when using generative models to create datasets? This, in turn, led to our latest [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2921,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"cybocfi_hide_featured_image":"","footnotes":""},"categories":[1],"tags":[],"class_list":["post-2870","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/posts\/2870","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/comments?post=2870"}],"version-history":[{"count":57,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/posts\/2870\/revisions"}],"predecessor-version":[{"id":2943,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/posts\/2870\/revisions\/2943"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/media\/2921"}],"wp:attachment":[{"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/media?parent=2870"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/categories?post=2870"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ozer.gt\/log\/wp-json\/wp\/v2\/tags?post=2870"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}