Approximating the current radiation treatment plan by finding and reusing the most similar plan in the plan database is a challenging task. Testing the effectiveness of this approach requires a suitably large patient database. Unfortunately, hospital resources cannot be used directly because of confidentiality restrictions. A limited set of anonymized patient data can be requested by following the proper procedures, but its volume is too small to properly assess the performance of the plan extraction algorithm.
To circumvent this limitation, we can enlarge the available data set by automatically generating a larger patient population from the anonymized hospital data. To simulate a real patient population, the generator must take the distribution of patient metadata into account and produce the simulated patients accordingly. Basic patient parameters such as weight and height are essential, since they allow us to generate thin or obese and short or tall patients, and they are relatively simple to implement. More complex parameters, such as certain medical conditions (e.g. scoliosis) or gender, can also be simulated. These, however, are outside the scope of the first phase of this project.
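As a rough illustration of generating patients according to the distribution of their metadata, the following Python sketch fits normal distributions to the height and body mass index (BMI) of a few seed patients and samples new height/weight pairs from them. It is not the project's generator: the `Patient` class, the `generate_population` function, the seed values, the plausibility bounds, and the choice to sample height and BMI independently are all assumptions made for this example.

```python
import random
import statistics
from dataclasses import dataclass

# Hypothetical plausibility bounds, chosen only for this example.
HEIGHT_RANGE_M = (1.2, 2.2)
BMI_RANGE = (12.0, 60.0)


@dataclass
class Patient:
    """Hypothetical record holding only the two basic parameters."""
    height_m: float
    weight_kg: float

    @property
    def bmi(self) -> float:
        return self.weight_kg / self.height_m ** 2


def generate_population(seed_patients, n, rng=None):
    """Sample n synthetic patients from normal fits to seed heights and BMIs.

    Assumption: height and BMI are treated as independent and normally
    distributed; weight is then derived from the two sampled values.
    """
    rng = rng or random.Random()
    heights = [p.height_m for p in seed_patients]
    bmis = [p.bmi for p in seed_patients]
    h_mu, h_sigma = statistics.mean(heights), statistics.stdev(heights)
    b_mu, b_sigma = statistics.mean(bmis), statistics.stdev(bmis)

    population = []
    while len(population) < n:
        h = rng.gauss(h_mu, h_sigma)
        b = rng.gauss(b_mu, b_sigma)
        # Reject draws outside the assumed plausible ranges and resample.
        if not (HEIGHT_RANGE_M[0] <= h <= HEIGHT_RANGE_M[1]):
            continue
        if not (BMI_RANGE[0] <= b <= BMI_RANGE[1]):
            continue
        population.append(Patient(height_m=h, weight_kg=b * h * h))
    return population


# Made-up seed values standing in for the anonymized hospital data.
SEED_PATIENTS = [
    Patient(1.62, 58.0),
    Patient(1.75, 82.0),
    Patient(1.80, 95.0),
    Patient(1.68, 70.0),
    Patient(1.90, 88.0),
    Patient(1.55, 49.0),
]

if __name__ == "__main__":
    for patient in generate_population(SEED_PATIENTS, 5, random.Random(0)):
        print(f"{patient.height_m:.2f} m, {patient.weight_kg:.1f} kg")
```

Sampling BMI rather than weight directly is one simple way to keep weight tied to height. Whether normal distributions describe the real population well enough would have to be checked against the anonymized data.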