Details
- Improvement
- Status: Accepted
- Normal
- Resolution: Unresolved
- None
- None
- Phoenix
Description
Hi,

We have darn near everything on our systems managed by puppet, so almost all of the resource types we manage have a

    resources { 'user':
      purge => true
    }

or the like.
Well, that ends up calling Puppet::Type::Resources#generate for each resource type, which in turn calls catalog.resource_refs once for each resource instance (managed or unmanaged) in the reject block below:

    # Generate any new resources we need to manage. This is pretty hackish
    # right now, because it only supports purging.
    def generate
      return [] unless self.purge?
      resource_type.instances.
        reject { |r| catalog.resource_refs.include? r.ref }.
        select { |r| check(r) }.
        select { |r| r.class.validproperty?(:ensure) }.
        select { |r| able_to_ensure_absent?(r) }.
        each { |resource|
          @parameters.each do |name, param|
            resource[name] = param.value if param.metaparam?
          end

          # Mark that we're purging, so transactions can handle relationships
          # correctly
          resource.purging
        }
    end

In our case, that was 7000 calls to resource_refs totaling 265 seconds of runtime across 20 calls of generate(). That is about half of my agent runtime, so there's a bit more tuning left to go. (There seem to be 2 calls of generate() per managed type.)

If you just factor the call to resource_refs out of the reject loop, assigning it to a local variable in generate() before calling reject(), the function runtime drops to 42 seconds (26 seconds in Array#reject and 15 in the instances call). If I turn the resource_refs into the keys of a hash, runtime drops to 18 seconds, but that's perhaps an optimization too far for people who don't have our insane catalog size. For the sake of being concrete, though, this is how I rewrote the first few lines of the function:

    def generate
      return [] unless self.purge?
      refs = Hash[*catalog.resource_refs.zip([]).flatten]
      resource_type.instances.
        reject { |r| refs.include? r.ref }.
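
To make the difference between the two shapes easier to see outside of Puppet, here is a minimal standalone sketch. FakeCatalog and FakeResource are hypothetical stand-ins, not Puppet classes, and the sketch assumes (as the timings above suggest) that resource_refs rebuilds its list every time it is called. It also uses a Set rather than the Hash from my rewrite; that is a substitution for illustration and avoids splatting a very large array into Hash[].

```ruby
require 'set'

# Hypothetical stand-in for a resource instance; only #ref matters here.
FakeResource = Struct.new(:ref)

# Hypothetical stand-in for the catalog. It counts calls and returns a
# fresh array each time, mimicking an accessor that is recomputed per call.
class FakeCatalog
  attr_reader :calls

  def initialize(refs)
    @refs = refs
    @calls = 0
  end

  def resource_refs
    @calls += 1
    @refs.dup
  end
end

# Original shape: resource_refs is evaluated once per instance, and
# Array#include? then scans the whole list linearly.
def unmanaged_naive(catalog, instances)
  instances.reject { |r| catalog.resource_refs.include?(r.ref) }
end

# Hoisted shape: evaluate resource_refs once, then use a Set so each
# membership test is a hash lookup instead of a linear scan.
def unmanaged_hoisted(catalog, instances)
  refs = catalog.resource_refs.to_set
  instances.reject { |r| refs.include?(r.ref) }
end

managed   = (1..500).map { |i| "User[u#{i}]" }
instances = (1..600).map { |i| FakeResource.new("User[u#{i}]") }

naive_catalog   = FakeCatalog.new(managed)
hoisted_catalog = FakeCatalog.new(managed)

naive   = unmanaged_naive(naive_catalog, instances)
hoisted = unmanaged_hoisted(hoisted_catalog, instances)

puts "naive calls: #{naive_catalog.calls}, hoisted calls: #{hoisted_catalog.calls}"
puts "same result: #{naive == hoisted}, unmanaged: #{hoisted.size}"
```

Both versions return the same unmanaged instances; the only change is how often resource_refs is evaluated and how each membership test is performed.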