Table 2. Quantitative comparison of deep learning-based scene synthesis methods [37]. ↓ indicates that lower values are better.
Methods | Bedroom KL↓ | Bedroom FID↓ | Bedroom CAS(%) | Dining room KL↓ | Dining room FID↓ | Dining room CAS(%) | Living room KL↓ | Living room FID↓ | Living room CAS(%) | Library KL↓ | Library FID↓ | Library CAS(%) |
FastSyn [35] | 6.4 | 88.1 | 88.3 | 51.8 | 58.9 | 93.5 | 17.6 | 66.6 | 94.5 | 43.1 | 86.6 | 81.5 |
SceneFormer [36] | 5.2 | 90.6 | 97.2 | 36.8 | 60.1 | 71.3 | 31.3 | 68.1 | 72.6 | 23.2 | 89.1 | 88.0 |
LayoutGPT [8] | 17.5 | 68.1 | 60.6 | - | - | - | 14.0 | 76.3 | 94.5 | - | - | - |
ATISS [5] | 8.6 | 73.0 | 61.1 | 15.6 | 47.6 | 69.1 | 14.1 | 43.3 | 76.4 | 10.1 | 75.3 | 61.7 |
COFS [28] | 5.0 | 73.2 | 61.0 | 9.3 | 43.1 | 76.1 | 8.1 | 35.9 | 78.9 | 6.7 | 75.7 | 66.2 |
DiffuScene [27] | 5.1 | 69.0 | 59.7 | 7.9 | 45.8 | 70.6 | 8.3 | 38.2 | 75.1 | - | - | - |
Forest2Seq [37] | 4.2 | 67.9 | 58.3 | 5.5 | 40.2 | 65.6 | 5.9 | 35.2 | 68.0 | 5.2 | 69.1 | 57.3 |