BlenderBIM spreadsheet writer for .xlsx and .ods

C by Coen on 2 Feb 2022 #

Does anybody maybe know how to get the filtered data from LibreOffice (.ods file) to Python. I've been reading a lot of threads on SO, but they all apply the filter in pandas.

I've made this filtering in LibreOffice,

Now I want to retrieve only the GlobalId values from that filtered list

I tried:


            if blenderbim_openoffice_xml_properties.my_file_path.endswith(".ods"):





                ###################################

                ### Get filtered data from .ods ###

                ###################################

                dataframe = pd.read_excel(blenderbim_openoffice_xml_properties.my_file_path, sheet_name=blenderbim_openoffice_xml_properties.my_workbook, engine="odf")



                print (dataframe['GlobalId'])

It returns each GlobalId from the .ods file. I only want the filtered items.

I can't use the openpyxl module, it's how I do it with .xlsx files like this for example:


                workbook_openpyxl = load_workbook(blenderbim_openoffice_xml_properties.my_file_path)

                worksheet_openpyxl = workbook_openpyxl[blenderbim_openoffice_xml_properties.my_workbook] 



                global_id_filtered_list = []


                for row in worksheet_openpyxl:     

                    if worksheet_openpyxl.row_dimensions[row[0].row].hidden == False:

                        for cell in row:  

                            if cell in worksheet_openpyxl['A']:  

                                global_id_filtered_list.append(cell.value)

G by Gorgious on 2 Feb 2022 #

The pandas module read_excel function has a skiprows argument but then you would need to evaluate the sheet to see wich rows are hidden before reading the file, so we're back at square 1...

C by Coen on 3 Feb 2022 # +7 votes

Update BlenderBIM spreadsheet:

I've commited the code today and added a new zip.

Screenshot of the add-on as of 3d february 2022:

New features:

.ods export possible, but no filtering IFC elements from an .ods file yet
Clear distinction between Area definitions, NetSideArea, NetArea. Area export each defintion of Area. Same for Volume and NetVolume.
A file dialog to load a spreadsheet in, so only one spreadsheet at the time can be used for filtering. So users can see from which spreadsheet file they are filtering in Blender.
Calculates total sum in .xlsx for NetArea and NetSideArea after filtering.

To do:

Apply autofilter in LibreOffice with an .ods file
Make IFC elements filtering possible from an .ods file

Made a small demo on youtube in English, really hope I explain it clear.

@duncan

I renamed my github repo, so all the links in the OP are broken. Don't know if you could maybe edit it?

G by Gorgious on 3 Feb 2022 # +1 votes

Awesome ! Do you think you could provide an example excel file that I could play with using your addon ?

C by Coen on 3 Feb 2022, edited 3 Feb 2022 #

@Gorgious said:

Awesome ! Do you think you could provide an example excel file that I could play with using your addon ?

Yes ofcourse :-), do you mean a template file or sample file? Or a .xlsx or .ods file? But the add-on creates them from any IFC file...

G by Gorgious on 3 Feb 2022 # +1 votes

Ideally I'd like a very big xlsx or ods file which which would come from your addon, which I don't have at the moment but you're right that I could generate it myself from an ifc :)

I've noticed the import operation takes quite a bit of time and I'd like to explore how you could maybe make it faster, if that's ok with you

C by Coen on 3 Feb 2022 #

@Gorgious said:

Ideally I'd like a very big xlsx or ods file which which would come from your addon, which I don't have at the moment but you're right that I could generate it myself from an ifc :)

I've noticed the import operation takes quite a bit of time and I'd like to explore how you could maybe make it faster, if that's ok with you

Yes ofcourse. :-D.

I've attached the .xlsx file of the Schependomlaan IFC.

Also tried to attached the .ods file but got a warning that the file format is not allowed.

V by vpajic on 3 Feb 2022 # +1 votes

@Coen you're starting to make me nervous with the pace of your developments ??. Fantastic work as always, I'm looking forward to testing this out intensively. I'll report back with any feedback I might have ;)

N by Nigel on 3 Feb 2022 # +1 votes

@Coen Great work. From beginning in CAD in the early 90s I have been frustrated with so much data being locked inside files without easy access, what you have done in a few months, I have waited for years to happen.

D by duncan on 3 Feb 2022 #

@Coen sure, tell me precisely what you want me to change to what.

C by Coen on 4 Feb 2022, edited 4 Feb 2022 #

@vpajic

you're starting to make me nervous with the pace of your developments

It would never got this pace if I haven't found osarch and the helpful people on here :-)

@Nigel

what you have done in a few months, I have waited for years to happen.

Well, I didn't do it alone. I just based my idea on work of others. :-)

@duncan

sure, tell me precisely what you want me to change to what.

Could you please remove all the images and replace it with this embedded youtube link?

https://youtu.be/_ZXUZQkcjlI

And could you also remove all the links, and just use this link please?

https://github.com/C-Claus/BlenderScripts/tree/master/BlenderBIMSpreadsheet

Thank you very much.

G by Gorgious on 4 Feb 2022, edited 4 Feb 2022 # +1 votes

@Coen Thanks for the file !

I think this is the main offender :


                for row in worksheet_openpyxl:     

                    if worksheet_openpyxl.row_dimensions[row[0].row].hidden == False:

                        for cell in row:  

                            if cell in worksheet_openpyxl['A']:  

                                global_id_filtered_list.append(cell.value)

You're iterating over each row, then iterating over each cell in the row, then checking if the cell is in the first column. You can just iterate over each row index, thus not needing to retrieve a cell object with all the columns, and simply checking if the row is not hidden by index, then access the cell with a string accessor. As long as the GlobalId is in the first column and the values start on row 3 this should work :


        for row_idx in range(3, worksheet_openpyxl.max_row + 1):

            if not worksheet_openpyxl.row_dimensions[row_idx].hidden:

                cell = worksheet_openpyxl[f"A{row_idx}"]

                global_id_filtered_list.append(str(cell.value))

I didn't time the difference because it was so long it didn't even compute within a few minutes with the before version on my computer. Using the after version, it computes in roughly 2 seconds, most of which is taken by loading the file in memory as far as I can tell.

I used the logic from this q&a from SO : https://stackoverflow.com/a/40027305

Also this gets rid of a potentially inefficient list slicing that runs for every ifc object a few lines lower :


                    obj.hide_viewport = data.get("GlobalId", False) not in global_id_filtered_list[1:]

That you can replace with


                    obj.hide_viewport = data.get("GlobalId", False) not in global_id_filtered_list

C by Coen on 4 Feb 2022 #

@Gorgious

I didn't time the difference because it was so long it didn't even compute within a few minutes

That made a huge difference, thank you very much. I timed the difference with 492 records filtered out in excel in the old way

63.61368060000001 seconds to show the IFC elements

After your refactor tips

2.4018126999999936 seconds to show the IFC elements

I committed the code straight away and added a new .zip file.

D by duncan on 4 Feb 2022 # +1 votes

@Coen said:

Could you please remove all the images and replace it with this embedded youtube link?

https://youtu.be/_ZXUZQkcjlI

And could you also remove all the links, and just use this link please?

https://github.com/C-Claus/BlenderScripts/tree/master/BlenderBIMSpreadsheet

Done, I think

D by duncan on 4 Feb 2022, edited 4 Feb 2022 # +3 votes

Let's push this project out a bit wider :-)

https://www.linkedin.com/feed/update/urn:li:activity:6895452758858706944

N by Nigel on 6 Feb 2022 # +1 votes

@Coen

'Well, I didn't do it alone. I just based my idea on work of others. :-)'

therein lies the achievement, making good use of what is already available

M by Moult on 6 Feb 2022 # +1 votes

@Coen perhaps when you feel the time is right, would you feel comfortable with writing up an article to be published on OSArch.org showcasing what you've achieved? It would be a fantastic thing to share with the community.

Also, good catch with the lack of Open Document format support on this forum! I've now added support for ODS, ODT and ODP.

G by Gorgious on 7 Feb 2022, edited 7 Feb 2022 # +3 votes

@Coen I'm a bit confused by the lack of possibility to tinker with .ods files using a python module. Pandas is nice to extract data, but I'd expect to be able to extract smart data. :)

After some digging around and reading some discussions about the subject I found a .ods is basically a zip file containing some xmls. Using this handy parser I've constructed a simple(r) parser that outputs a list of rows (o-based index) that are hidden.


# Import the needed modules

import zipfile

import xml.parsers.expat


# https://www.stefaanlippens.net/opendocumentpython/

# Get content xml data from OpenDocument file

ziparchive = zipfile.ZipFile(r"C:\Path\To\My\File.ods", "r")

xmldata = ziparchive.read("content.xml")

ziparchive.close()




class Element(list):

    def __init__(self, name, attrs):

        self.name = name

        self.attrs = attrs




class TreeBuilder:

    def __init__(self):

        self.root = Element("root", None)

        self.path = [self.root]

    def start_element(self, name, attrs):

        element = Element(name, attrs)

        self.path[-1].append(element)

        self.path.append(element)

    def end_element(self, name):

        assert name == self.path[-1].name

        self.path.pop()

    def char_data(self, data):

        self.path[-1].append(data)




def get_hidden_rows(node):

    row = 0

    for e in node:

        if not isinstance(e, Element):

            continue

        yield from get_hidden_rows(e)

        if e.name != "table:table-row":

            continue

        attrs = e.attrs

        rows = int(attrs.get("table:number-rows-repeated", 1))

        if "table:visibility" in attrs.keys():  # If the key is here, we can assume it's hidden (or can we ?)

            for row_idx in range(rows):

                yield row + row_idx

        row += rows






# Create parser and parsehandler

parser = xml.parsers.expat.ParserCreate()

treebuilder = TreeBuilder()

# Assign the handler functions

parser.StartElementHandler  = treebuilder.start_element

parser.EndElementHandler    = treebuilder.end_element

parser.CharacterDataHandler = treebuilder.char_data


# Parse the data

parser.Parse(xmldata, True)


hidden_rows = get_hidden_rows(treebuilder.root)  # This returns a generator object

print(list(hidden_rows))

I'm sure it can be optimised but it seems to do the job. I expect you can directly feed this list into pandas.read_excel skiprows argument.

outputs :

[1, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 38, 39, 40, 41, 42, 43, 44, 51, 52, 53, 54, 55, 56, 57, 58, 59]

A bit of explanation : Basically at the end of the context.xml file inside the ods archive, there are a bunch of definitions. If a row is hidden it will have a table:visibility="collapse" key / value. The rows are described from index 0 to infinity or until the last hidden row is defined. When two or more rows in succession have the same state a flag is added to the first row definition eg table:number-rows-repeated="3". From that we can iterate over all the row definitions and simply extract the rows that interest us.

Hope that helps ! If you need help implementing it into your add-on don't hesitate to ask.

Cheers

C by Coen on 8 Feb 2022 # +2 votes

@Moult

perhaps when you feel the time is right, would you feel comfortable with writing up an article to be published on OSArch.org showcasing what you've achieved? It would be a fantastic thing to share with the community.

You mean a technical article? Or personal? Or a tutorial?

@Gorgious

Thank you so much! I was really getting desperate, trying all kind of obscure python modules like pyexcel-ods and others to no avail.

I really was looking on how to modify the xml in the zip file. Turns out it comes standard with python. I implemented your pointers. Now it's possible to filter from an .ods file as well. :-D. I commited the code and added a new zip today.

After @duncans LinkedIn promotional activity, I got about 35 unique views on my Github page.

and I got a pull request from a github user called denizmaral. He pointed out the file would not open on MacOS and I approved his request. I have no way to test this obviously. It works on Windows 10, so if you read this, thank you :-)

I wanted to test if the add-on will work on Linux Ubuntu, so far so good:

But when I checked the add-on I get the following error:

ModuleNotFoundErro: No modude named 'pandas.libs.interval'

Need to see how I can fix this.

Also I think the way @Gorgious reads the xml file I think can also look for a method on how to apply the autofilter in the xml file. Making the assumption it happens in the content.xml. As I asked here on SO

Some screenshots from filtering from an .ods file.

Manually applying the autofilter in LibreOffice

Now possible to filter in Blender

And in general I did some refactoring, added new functions and tried to clean up the script a bit.

Thanks everyone for the help and testing so far.

D by duncan on 8 Feb 2022 # +1 votes

... since Excel claims to read .ods files (it reads them fine as far as I know) maybe the .xlsx export could even be dropped instead of trying to maintain function parity. Just saying.

G by Gorgious on 8 Feb 2022 # +1 votes

Nice !! Yeah we can investigate how the custom filters are handled in ods. Writing data will be more troublesome than parsing it, but it's definitely doable. A minimal working file with something like 2 columns and 2 rows and a filter should yield the precious lines at the bottom of the xml file if I'm not mistaken. After that it's "just" a matter of copying it correctly when writing the file back, without messing things up :)

Btw slight optimisation : You can replace


    for row_idx in range(rows):

     yield row + row_idx

with


     yield from range(row, row + rows)

A by avico on 13 Feb 2022 #

Hi Cohen,

Congratulations for your work. I think is a very good complement to Blende Bim.

I'm testing your addon and get an error with a model imported from here (the first model; the others I tested worked well):

https://ifcinfra.de/ifc-bridge/bridge-abschluss/#beispielmodelle

Export to "ods" format doesn't work and get the follow error:

location: <unknown location>:-1

Error: Python: Traceback (most recent call last):

File "C:\Users\Admin\AppData\Roaming\Blender Foundation\Blender\2.93\scripts\addons\BlenderBIMSpreadsheet.py", line 467, in execute

construct_data_frame = ConstructDataFrame(context)

File "C:\Users\Admin\AppData\Roaming\Blender Foundation\Blender\2.93\scripts\addons\BlenderBIMSpreadsheet.py", line 149, in init

ifc_dictionary['Classification'].append(self.get_classification_code(context, ifcproduct=product)[0])

File "C:\Users\Admin\AppData\Roaming\Blender Foundation\Blender\2.93\scripts\addons\BlenderBIMSpreadsheet.py", line 249, in get_classification_code

assembly_code_list.append(ifcproduct.HasAssociations[0].RelatingClassification.ItemReference)

File "C:\Users\Admin\AppData\Roaming\Blender Foundation\Blender\2.93\scripts\addons\blenderbim\libs\site\packages\ifcopenshell\entity_instance.py", line 122, in getattr

raise AttributeError(

AttributeError: entity instance of type 'IFC4X2.IfcClassificationReference' has no attribute 'ItemReference'

maybe is a problem with the ifc sample but it wild be good to "skip" this questions to obtain the spreadsheet file. Now, the process stop without results.

C by Coen on 13 Feb 2022, edited 13 Feb 2022 #

@avico

Thanks for the feedback, don't know when you downloaded the add-on. Because I fixed this bug two hours ago. If you get the latest version this error should dissappear. I discovered classification works a bit different in IFC4.

I fixed it like this, If anyone has any tips on how to improve it's highly appreciated. :-)


if ifcproduct.HasAssociations:



            if ifcproduct.HasAssociations[0].is_a() == 'IfcRelAssociatesClassification':



                if ifcproduct.HasAssociations[0].RelatingClassification:



                    #for IFC2x3

                    if ifc_version == 'IFC2X3':

                        if ifcproduct.HasAssociations[0].RelatingClassification.ItemReference:

                            assembly_code_list.append(ifcproduct.HasAssociations[0].RelatingClassification.ItemReference)



                    #for IFC4 

                    if ifc_version == 'IFC4':

                        if ifcproduct.HasAssociations[0].RelatingClassification.Identification:

                            assembly_code_list.append(ifcproduct.HasAssociations[0].RelatingClassification.Identification)



            if ifcproduct.HasAssociations[0].is_a() == 'IfcRelAssociatesMaterial':



                for i in ifcproduct.HasAssociations:

                    if i.is_a() == 'IfcRelAssociatesClassification':



                        #for IFC2x3

                        if ifc_version == 'IFC2X3':

                            if i.RelatingClassification.ItemReference:

                                assembly_code_list.append(i.RelatingClassification.ItemReference)



                        #for IFC4     

                        if ifc_version == 'IFC4':

                            if i.RelatingClassification:

                                assembly_code_list.append(i.RelatingClassification.Identification)

A by avico on 13 Feb 2022 #

Coen,

I downloaded the addon this morning (spanish time), some hours before the comment. I will continue trying this new version.

C by Coen on 23 Feb 2022 #

@avico

I downloaded the addon this morning (spanish time), some hours before the comment. I will continue trying this new version

Did it work?