I’ve been writing code professionally for 24 years, 15 of which has been Python and 9 years of that with Docker. I got tired of running into the same complications every time I started a new job, so I wrote this. Maybe you’ll find it useful, or it could even start a conversation, but this post has been a long time coming.
Update: I had a few requests for a demo repo as a companion to this post, so I wrote one today. It includes a very small Django demo user Docker, Compose, and GitLab CI.
It is not realistic to replicate a production setup in development when you’re working with sensitive user data. I’ve worked in different contexts (law enforcement, healthcare, financial services) where we’ve had complicated setups (in one instance including a thing called pre-staging environment), but never would a sizeable team of developers have access to user data, and thus to a realistic setup in terms of size, let alone of quality of data.
It sounds like you’re confusing the application with the data. Nothing in this model requires the use of production data.
Just trying not so confuse realistic testing with self-deception :) Not convinced testing with synthetic data can pretend to be similar to a production environment.