[PDK-1088] `pdk build` is disproportionately slow when a module contains lots of files
Created: 2018/07/18  Updated: 2018/07/26  Resolved: 2018/07/18

Status: Closed
Project: Puppet Development Kit
Component/s: None
Affects Version/s: PDK 1.6.0
Fix Version/s: PDK 1.6.1

Type: Bug
Priority: Normal
Reporter: Glenn Sarti
Assignee: Jesse Scott
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Windows 10 - 1803
PDK 1.6.0
Module - puppetlabs/windows

Attachments: PNG File procmon.png    
Issue Links:
relates to PDK-903 Don't scan .git directory when buildi... Resolved
Method Found: Needs Assessment
Release Notes: Bug Fix
Release Notes Summary: Resolved an issue where `pdk build` was disproportionately slow when a module directory tree contained lots of files.
QA Risk Assessment: Needs Assessment


A simple `pdk build` took 86 seconds to build a tarball containing 28 files and 6 directories, and generated 4.7 million disk I/O operations.

Looking at the procmon logs, it appears to query the entire directory structure when searching for each file. After removing the .bundle directory, the `pdk build` time dropped back to about a second.

Comment by Glenn Sarti [ 2018/07/18 ]


On a Windows box

Contents of .pdkignore


Comment by Glenn Sarti [ 2018/07/18 ]

I have a procmon log, but it's 5 GB in size, so here's a summary.

Notice that of the 44s of file access time, 41s was spent in the .bundle directory. Also note that it spent 1s in the .git directory. So for some reason the pdk build process is continually polling through the entire module directory.

Comment by Jesse Scott [ 2018/07/18 ]

I think I've figured this out, and it's not really Windows or .bundle specific. There is an inefficient loop in the code that constructs the list of pathspecs to ignore. I'm working on a refactor now.
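
For illustration only, here is a minimal Python sketch of the kind of inefficiency described above: an ignore list that is rebuilt, with a full directory scan, for every candidate file. This is not PDK's actual code (PDK is written in Ruby and the loop itself is not shown in this ticket). Every function name, the `.bundle/*` pattern, and the scan counter are hypothetical, chosen only to show why the cost grows with the total number of files in the tree rather than with the number of files packaged.

```python
import fnmatch
import os
import tempfile


def list_files(root, stats):
    """Walk the whole tree under root; return relative '/'-separated paths."""
    stats["walks"] += 1  # each call is one full directory scan
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def ignored_paths(root, patterns, stats):
    """Expand ignore patterns into the set of matching paths (one scan)."""
    return {
        path
        for path in list_files(root, stats)
        if any(fnmatch.fnmatchcase(path, pat) for pat in patterns)
    }


def files_to_package_slow(root, patterns, stats):
    # Anti-pattern: the ignore set is rebuilt, with a full scan, per file.
    return [
        path
        for path in list_files(root, stats)
        if path not in ignored_paths(root, patterns, stats)
    ]


def files_to_package_fast(root, patterns, stats):
    # Hoisted: build the ignore set once, then filter against it.
    ignored = ignored_paths(root, patterns, stats)
    return [path for path in list_files(root, stats) if path not in ignored]


def demo(extra_files=50):
    """Build a throwaway module tree and compare the two strategies."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ["metadata.json", "manifests/init.pp"] + [
            f".bundle/gems/f{i}.rb" for i in range(extra_files)
        ]:
            full = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(full), exist_ok=True)
            open(full, "w").close()

        patterns = [".bundle/*"]  # hypothetical ignore pattern
        slow_stats, fast_stats = {"walks": 0}, {"walks": 0}
        slow = files_to_package_slow(root, patterns, slow_stats)
        fast = files_to_package_fast(root, patterns, fast_stats)
        return slow, fast, slow_stats["walks"], fast_stats["walks"]


if __name__ == "__main__":
    slow, fast, slow_walks, fast_walks = demo()
    print(slow == fast, slow_walks, fast_walks)
```

Both strategies return the same two files to package, but the slow variant performs one full tree scan per file in the tree plus one more, while the hoisted variant performs two scans regardless of tree size. That would be consistent with the report that removing a large .bundle directory restored normal build times, although the real cause in PDK may differ in detail.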

Generated at Mon Jan 20 07:16:32 PST 2020 using JIRA 7.7.1#77002-sha1:e75ca93d5574d9409c0630b81c894d9065296414.