[PDK-1088] `pdk build` is disproportionately slow when a module contains lots of files Created: 2018/07/18  Updated: 2018/07/26  Resolved: 2018/07/18

Status: Closed
Project: Puppet Development Kit
Component/s: None
Affects Version/s: PDK 1.6.0
Fix Version/s: PDK 1.6.1

Type: Bug Priority: Normal
Reporter: Glenn Sarti Assignee: Jesse Scott
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 10 - 1803
PDK 1.6.0
Module - puppetlabs/windows


Attachments: PNG File procmon.png    
Issue Links:
Relates
relates to PDK-903 Don't scan .git directory when buildi... Resolved
Template:
Method Found: Needs Assessment
Release Notes: Bug Fix
Release Notes Summary: Resolved an issue where `pdk build` was disproportionately slow when a module directory tree contained lots of files.
QA Risk Assessment: Needs Assessment

 Description   

A simple `pdk build` took 86 seconds to build a tarball containing 28 files and 6 directories, and generated 4.7 million disk I/O operations.

Looking at the procmon logs, it appears to query the entire directory structure when searching for each file. After removing the .bundle directory, `pdk build` time was back to a second or so.



 Comments   
Comment by Glenn Sarti [ 2018/07/18 ]

Repro:

On a Windows box

Contents of .pdkignore:

.git/
.*.sw[op]
.metadata
.yardoc
.yardwarns
*.iml
/.bundle/
/.idea/
/.vagrant/
/coverage/
/bin/
/doc/
/Gemfile.local
/Gemfile.lock
/junit/
/log/
/pkg/
/spec/fixtures/manifests/
/spec/fixtures/modules/
/tmp/
/vendor/
/convert_report.txt
/update_report.txt
.DS_Store

Comment by Glenn Sarti [ 2018/07/18 ]

I have a procmon log, but it's 5GB in size. Here's a summary:

Notice that of the 44s of file access time, 41s was spent in the .bundle directory and 1s in the .git directory, even though both are listed in .pdkignore. So for some reason the `pdk build` process is repeatedly walking the entire module directory.

Comment by Jesse Scott [ 2018/07/18 ]

I think I've figured this out, and it's not really Windows or .bundle specific. There is an inefficient loop in the code that constructs the list of pathspecs to ignore. Working on a refactor now.
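
For illustration only, here is a minimal Python sketch of the kind of inefficiency described above. It is not PDK's code (PDK is written in Ruby), and every name in it (`scan_tree`, `build_ignore_list`, `files_to_package_slow`, `files_to_package_fast`, `demo`) is hypothetical. It also assumes simple prefix matching instead of real pathspec semantics. The point is only that rebuilding the ignore list for each candidate path costs one full tree scan per path, while building it once costs a fixed number of scans.

```python
import os
import tempfile

SCANS = {"count": 0}  # number of full directory-tree scans performed


def scan_tree(root):
    """Walk the whole tree and return every relative path (the expensive step)."""
    SCANS["count"] += 1
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def build_ignore_list(root, prefixes):
    """Hypothetical stand-in for ignore-list construction: needs a full scan."""
    return {
        p for p in scan_tree(root)
        if any(p == pre or p.startswith(pre + os.sep) for pre in prefixes)
    }


def files_to_package_slow(root, prefixes):
    # The ignore list is rebuilt for every candidate path: one scan per path.
    return sorted(
        p for p in scan_tree(root) if p not in build_ignore_list(root, prefixes)
    )


def files_to_package_fast(root, prefixes):
    # The ignore list is built once and reused for every candidate path.
    ignored = build_ignore_list(root, prefixes)
    return sorted(p for p in scan_tree(root) if p not in ignored)


def demo():
    """Build a tiny module-like tree and compare scan counts of both variants."""
    with tempfile.TemporaryDirectory() as root:
        for rel in [".bundle/a", ".bundle/b", "manifests/init.pp", "metadata.json"]:
            path = os.path.join(root, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()

        SCANS["count"] = 0
        slow = files_to_package_slow(root, [".bundle"])
        slow_scans = SCANS["count"]

        SCANS["count"] = 0
        fast = files_to_package_fast(root, [".bundle"])
        fast_scans = SCANS["count"]

    return slow, slow_scans, fast, fast_scans


if __name__ == "__main__":
    slow, slow_scans, fast, fast_scans = demo()
    print("same result:", slow == fast)
    print("scans (per-path rebuild):", slow_scans)
    print("scans (build once):", fast_scans)
```

Both variants return the same file list. In the per-path variant the scan count grows with the number of entries in the tree, so each scan getting more expensive as the tree grows makes the total cost roughly quadratic. That would be consistent with a large ignored directory such as .bundle dominating the build time, though the actual PDK fix may differ from this sketch.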
