We developed a deep learning model for Building Information Modeling (BIM) software that uses scalable vector formats to enable flexible floor plan design in the industry.

Floor plans in the architectural domain come from many sources, and not all of them are in a scalable vector format. Converting floor plan images into fully annotated vector images is a process that can now be realized with computer vision. Novel datasets in this field have been used to train Convolutional Neural Network (CNN) architectures for object detection. Image enhancement through Super-Resolution (SR) is another established CNN-based technique in computer vision, used to convert low-resolution images into high-resolution ones. This work focuses on creating a multi-component module that stacks an SR model on a floor plan object detection model.

Read The Paper (Dev Khare et al.)

Floor-Plan-Detection dataset

Segmentation of floor plans needs to be precise for an end-to-end application. After observing the segmentation results of the empirical model on floor plans of various sizes, we concluded that the CubiCasa approach needs image enhancement as an essential preliminary step. CubiCasa5k is a large-scale floor plan image dataset containing 5000 samples annotated into over 80 floor plan object categories. The annotations are dense and versatile, using polygons to separate the different objects. Our floor plan module was tested on 100 images from the CubiCasa5k dataset with sizes less than 800x800. This was done to test the influence of super-resolution on low-resolution images.
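The selection of low-resolution samples described above can be sketched as a simple size filter. This is a minimal illustration, not the selection script used for the experiments: the `PlanInfo` record, the file names, and the reading of "less than 800x800" as "both sides below 800 pixels" are assumptions made for the example.

```python
from typing import List, NamedTuple


class PlanInfo(NamedTuple):
    """Hypothetical record describing one floor plan image."""
    name: str
    width: int
    height: int


def is_low_resolution(plan: PlanInfo, limit: int = 800) -> bool:
    # Assumed interpretation: both sides must be below the limit.
    return plan.width < limit and plan.height < limit


def select_low_resolution(plans: List[PlanInfo], limit: int = 800,
                          max_count: int = 100) -> List[PlanInfo]:
    # Keep the first `max_count` plans that pass the size filter.
    selected = [plan for plan in plans if is_low_resolution(plan, limit)]
    return selected[:max_count]


# Made-up sizes, only to show the filter in action.
plans = [
    PlanInfo("plan_a.png", 640, 480),
    PlanInfo("plan_b.png", 1200, 900),
    PlanInfo("plan_c.png", 799, 799),
    PlanInfo("plan_d.png", 800, 600),
]
low_res = select_low_resolution(plans)
print([plan.name for plan in low_res])
```

In practice the widths and heights would be read from the image files themselves; they are hard-coded here to keep the sketch self-contained.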

Download the Dataset

Our Floor-Plan-Detection uses a multi-component module that can perform image enhancement and object detection.

Since floor plan detection is an end-to-end conversion, the process need not run in real time. The main objective of the floor plan annotation task is to improve accuracy, without constraints on time and computing power. To this end, we experimented with a novel multi-component module that performs image enhancement followed by object detection. From the CubiCasa5k corpus we select ninety low-resolution floor plans and observe the increase in performance after super-resolution.
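The idea of stacking an enhancement stage in front of a detection stage can be sketched as a composition of two functions. Both stages below are deliberately trivial placeholders and are assumptions for illustration only: nearest-neighbour upscaling stands in for a trained SR network, and a brightness threshold stands in for the CubiCasa detection model.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of pixel values


def nearest_neighbor_upscale(image: Image, scale: int) -> Image:
    """Placeholder for an SR network: repeat every pixel `scale` times."""
    upscaled: Image = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(scale)]
        # Emit `scale` independent copies of the widened row.
        upscaled.extend(list(wide_row) for _ in range(scale))
    return upscaled


def threshold_detector(image: Image) -> Image:
    """Placeholder for the detector: label dark pixels 1, others 0."""
    return [[1 if pixel < 128 else 0 for pixel in row] for row in image]


def make_stacked_module(enhance: Callable[[Image], Image],
                        detect: Callable[[Image], Image]) -> Callable[[Image], Image]:
    """Build a module that runs enhancement first, then detection."""
    def run(image: Image) -> Image:
        return detect(enhance(image))
    return run


module = make_stacked_module(
    lambda image: nearest_neighbor_upscale(image, 2),
    threshold_detector,
)
tiny_plan = [[0, 255], [255, 0]]
labels = module(tiny_plan)
for row in labels:
    print(row)
```

Keeping the two stages as separate callables means that one SR model can be swapped for another without touching the detection stage.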

This work performs super-resolution for image enhancement before detecting floor plan icons and room types; stacking super-resolution frameworks with the CubiCasa architecture yields a multi-component module that does exactly this. The networks chosen for super-resolution are EDSR, ESPCN, FSRCNN, and LapSRN. To evaluate the results, we need a quantifiable measure that supports a conclusive statement on whether super-resolution improves performance. The accuracy scores therefore have to be computed against ground truth scaled by the same factor chosen for super-resolution; the CubiCasa5k corpus provides ground truth in Scalable Vector Graphics (SVG) format, which makes this possible.
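Because the ground truth is stored as vector polygons, scaling it to match a super-resolved image amounts to multiplying every coordinate by the SR factor. The sketch below assumes annotations in the standard SVG `points` attribute format ("x1,y1 x2,y2 ..."); the example polygon and the helper names are hypothetical and not taken from the CubiCasa5k tooling.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_points(points_attr: str) -> List[Point]:
    """Parse an SVG `points` attribute into (x, y) pairs."""
    numbers = [float(value) for value in points_attr.replace(",", " ").split()]
    if len(numbers) % 2 != 0:
        raise ValueError("points attribute must contain x,y pairs")
    return list(zip(numbers[0::2], numbers[1::2]))


def scale_points(points: List[Point], factor: float) -> List[Point]:
    """Scale polygon coordinates by the super-resolution factor."""
    return [(x * factor, y * factor) for x, y in points]


# Hypothetical room polygon, scaled for a 2x super-resolved image.
room = "10,20 110,20 110,80 10,80"
scaled = scale_points(parse_points(room), 2)
print(scaled)
```

Since vector coordinates scale exactly, the annotation itself picks up no interpolation error, which is not the case when a rasterized label mask is resized.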


The multi-component module detected rooms with higher accuracy than the CubiCasa5k model alone. The best improvement in accuracy on the chosen dataset is 39.47%, obtained with EDSR as the SR model. EDSR also gave the best result of all the super-resolution methods, with a 12.17% improvement on average. This stacked super-resolution method has to be used during training to alter the dimensions of low-resolution images. This could enhance the overall performance of the network and extend it to a wider range of use cases.
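A per-class comparison of this kind can be sketched as follows. The exact metric behind the reported scores is not restated here, so this example assumes a simple per-class pixel accuracy (the fraction of ground-truth pixels of each class that are predicted correctly); the tiny label grids are made up for illustration and are not results from the experiments.

```python
from collections import defaultdict
from typing import Dict, List

LabelMap = List[List[str]]  # one class label per pixel


def per_class_accuracy(truth: LabelMap, predicted: LabelMap) -> Dict[str, float]:
    """Fraction of ground-truth pixels of each class predicted correctly."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for truth_row, predicted_row in zip(truth, predicted):
        for truth_label, predicted_label in zip(truth_row, predicted_row):
            total[truth_label] += 1
            if truth_label == predicted_label:
                correct[truth_label] += 1
    return {label: correct[label] / total[label] for label in total}


# Made-up label maps; ground truth and predictions share one resolution.
truth = [["wall", "wall", "bedroom"],
         ["wall", "bedroom", "bedroom"]]
baseline = [["wall", "bedroom", "bedroom"],
            ["bedroom", "bedroom", "wall"]]
with_sr = [["wall", "wall", "bedroom"],
           ["wall", "bedroom", "wall"]]

baseline_scores = per_class_accuracy(truth, baseline)
sr_scores = per_class_accuracy(truth, with_sr)
for label in sorted(baseline_scores):
    print(label, round(baseline_scores[label], 3), round(sr_scores[label], 3))
```

The comparison is only meaningful when the prediction and the ground truth have the same dimensions, which is why the ground truth is scaled by the SR factor before scoring.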

Table 1. Room detection comparison of SR methods with original.

Background 0.636 0.637 0.635 0.636 0.521
Outdoor 0.459 0.458 0.469 0.469 0.227
Wall 0.170 0.169 0.168 0.170 0.088
Kitchen 0.304 0.304 0.304 0.309 0.142
Living room 0.459 0.450 0.448 0.448 0.185
Bedroom 0.449 0.437 0.448 0.435 0.238
Bath 0.199 0.198 0.210 0.199 0.148
Entry 0.253 0.251 0.251 0.255 0.207
Railing 0.331 0.330 0.332 0.331 0.353
Storage 0.438 0.438 0.437 0.438 0.455
Garage 0.904 0.904 0.905 0.905 0.910
Undefined 0.207 0.210 0.208 0.208 0.159
micro avg 0.460 0.461 0.459 0.460 0.305

Table 2. Icon detection comparison of SR methods with original.

No Icon 0.935 0.935 0.935 0.935 0.941
Window 0.109 0.097 0.099 0.106 0.025
Door 0.036 0.037 0.036 0.038 0.015
Closet 0.159 0.159 0.159 0.157 0.094
Electrical Appliance 0.119 0.122 0.120 0.122 0.070
Toilet 0.117 0.106 0.099 0.117 0.159
Sink 0.081 0.079 0.081 0.081 0.159
Sauna Bench 0.465 0.464 0.463 0.462 0.452
Fire Place 0.843 0.843 0.844 0.844 0.909
Bathtub 0.910 0.910 0.910 0.910 0.989
Chimney 0.978 0.978 0.978 0.978 1.000
micro avg 0.875 0.875 0.875 0.874 0.886

To learn more, check out our GitHub and read our publication presented at the 3rd International Conference on Machine Learning, Image Processing, Network Security and Data Sciences.


If you have queries about our work, contact us at: