Adding global input hashes in lage server worker #861
Merged
This PR reduces the cost of BuildXL's hashing. BuildXL knows nothing about the env glob or how it is reused across pips, since that is a lage (JS task runner) concept, so it naively hashes each file. We can recover this perf optimization by taking control of the hashing of those entries ourselves and passing the result to BuildXL as a single input file. Because each target can have its own env glob pattern, we output one global_inputs_hash file per target; the calculations are cached internally to keep this fast.
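A minimal sketch of the idea above, not the code from this PR: the function names, the cache key, the SHA-256 choice, and the per-target file naming are all assumptions made for illustration. It takes an already-expanded list of global input files (glob expansion is left out), hashes them once per distinct list, and writes the digest to one small file per target.

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical in-memory cache: one digest per (root, sorted file list), so targets
// that share the same global inputs only pay for hashing once.
const digestCache = new Map<string, string>();

function hashGlobalInputs(root: string, files: string[]): string {
  const sorted = [...files].sort();
  const key = [root, ...sorted].join("\n");
  const cached = digestCache.get(key);
  if (cached !== undefined) return cached;

  const hash = createHash("sha256");
  for (const file of sorted) {
    // Mix in the relative path as well as the content so renames change the digest.
    hash.update(file);
    hash.update("\0");
    hash.update(fs.readFileSync(path.join(root, file)));
    hash.update("\0");
  }
  const digest = hash.digest("hex");
  digestCache.set(key, digest);
  return digest;
}

// Hypothetical layout: one hash file per target, named after a sanitized target id.
function writeGlobalInputHashFile(outDir: string, targetId: string, digest: string): string {
  const safeName = targetId.replace(/[^a-zA-Z0-9_.-]/g, "_");
  const filePath = path.join(outDir, `${safeName}.global_inputs_hash`);
  fs.mkdirSync(outDir, { recursive: true });
  fs.writeFileSync(filePath, digest);
  return filePath;
}

// Usage: two targets share the same global inputs, so the second call is a cache hit.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "global-inputs-"));
fs.writeFileSync(path.join(demoRoot, "a.json"), "{}");
fs.writeFileSync(path.join(demoRoot, "b.json"), "[]");
const digestForBuild = hashGlobalInputs(demoRoot, ["b.json", "a.json"]);
const digestForTest = hashGlobalInputs(demoRoot, ["a.json", "b.json"]);
const buildHashFile = writeGlobalInputHashFile(path.join(demoRoot, "out"), "pkg-a#build", digestForBuild);
```

With this shape, the build engine only needs to track one small file per target in place of every file matched by the env glob.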
We are leveraging the "info" command to generate these files because BuildXL is guaranteed to run it every time. Note that a cache hit in BXL means it never makes the lage exec call for that pip, so we cannot prepopulate things at the exec calls. That makes "info" the best place to generate these files as a prep step. We will only do this for the `--server` case, since normal execs don't need it (it is only needed to optimize for BuildXL).
This pull request introduces several optimizations and improvements to the `lage` project, focusing on making BuildXL runs more efficient and improving the handling of global input hashes. The most important changes are new functionality for handling global input hashes, refactoring of existing methods, and updated protobuf definitions to accommodate new fields.
Enhancements to BuildXL Optimizations:
- Added global input hash generation to the `infoAction` function. This optimization helps speed up BuildXL runs by avoiding repeated file reads. (packages/cli/src/commands/info/action.ts, [[1]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fR155-R206), [[2]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fR248-R251), [[3]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fL200-L202), [[4]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fL218-R276))
- Added a `getGlobalInputHashFilePath` function to determine the path for global input hash files. (packages/cli/src/commands/targetHashFilePath.ts, [packages/cli/src/commands/targetHashFilePath.tsR1-R9](https://github.com/microsoft/lage/pull/861/files#diff-ba547c11c29e2e99e83302c76c27d57b07cbbb5722b6eedd4c3caef01b3aaf51R1-R9))
Refactoring and Code Improvements:
- Refactored the `generateCommand` function to use the new `shouldRunWorkersAsService` helper function, improving code readability and maintainability. (packages/cli/src/commands/info/action.ts, [[1]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fL200-L202), [[2]](https://github.com/microsoft/lage/pull/861/files#diff-5d45065c5d72c3b1acb9f00b687053c6c6f987fa9f3f390d01e1c7a243d3df9fL218-R276))
- Added `FileHasher` and `hashStrings` exports to the `hasher` package to support hash generation for global inputs. (packages/hasher/src/index.ts, [packages/hasher/src/index.tsR4-R5](https://github.com/microsoft/lage/pull/861/files#diff-1ca6c8a7c411a9c12ff5236babab9eb2e3d60c4f4772b8562f5beae74e9ce6d5R4-R5))
Protobuf and RPC Updates:
- Updated the `RunTargetResponse` message in the protobuf definition to include the `cwd` and `global_input_hash_file` fields, ensuring that the necessary data is transmitted during remote procedure calls. (packages/rpc/proto/lage/v1/lage.proto, [packages/rpc/proto/lage/v1/lage.protoL16-R23](https://github.com/microsoft/lage/pull/861/files#diff-1cf9ffd1076b73fb018e4d7df07389a990b983465f604d6d956b1a7f18b859c2L16-R23))
- packages/rpc/src/gen/lage/v1/lage_pb.ts, [[1]](https://github.com/microsoft/lage/pull/861/files#diff-1caefe9c23e7418845c96bfe307a41be1429c6dbfe8431be5a50bc27bf01ff8eL88-R125), [[2]](https://github.com/microsoft/lage/pull/861/files#diff-1caefe9c23e7418845c96bfe307a41be1429c6dbfe8431be5a50bc27bf01ff8eL132-R144)
Additional Changes:
- Imported the `path` module in `executeRemotely.ts` to handle file paths for global input hashes. (packages/cli/src/commands/exec/executeRemotely.ts, [packages/cli/src/commands/exec/executeRemotely.tsR1](https://github.com/microsoft/lage/pull/861/files#diff-930be17d1ef5b16a8986a15af1a19fe24c197583c29c0ddfec89fef77db68d6bR1))
- Updated the `createLageService` function to handle global input hash files and ensure the correct paths are used during task execution. (packages/cli/src/commands/server/lageService.ts, [[1]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2L4-R4), [[2]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R19), [[3]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2L165-L168), [[4]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2L236-R243), [[5]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R257), [[6]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R271-R272), [[7]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R319-R326), [[8]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R339-R346), [[9]](https://github.com/microsoft/lage/pull/861/files#diff-637e91c79aff43e711921eecd75f940dcc04834d68306ff53e84933591e001c2R355-R360))
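
For illustration only, here is one way a client could combine the `cwd` and `global_input_hash_file` response fields into a usable path. The interface and helper below are hypothetical: they are not the generated `lage_pb.ts` types, and the assumption that the hash file may be reported relative to `cwd` is not confirmed by this PR.

```typescript
import * as path from "node:path";

// Hypothetical stand-in for the response: only the two fields named in this PR,
// written in camelCase as generated TypeScript protobuf code usually does.
interface RunTargetResponseLike {
  cwd: string;
  globalInputHashFile: string;
}

// Returns a path to the hash file, or undefined when the server reported none.
// Assumption: a relative value is meant to be resolved against `cwd`.
function resolveGlobalInputHashFile(response: RunTargetResponseLike): string | undefined {
  if (response.globalInputHashFile === "") return undefined;
  return path.isAbsolute(response.globalInputHashFile)
    ? response.globalInputHashFile
    : path.join(response.cwd, response.globalInputHashFile);
}

// Usage with made-up values.
const relativeCase = resolveGlobalInputHashFile({
  cwd: "/repo/packages/a",
  globalInputHashFile: "hashes/build.global_inputs_hash",
});
const absoluteCase = resolveGlobalInputHashFile({
  cwd: "/repo/packages/a",
  globalInputHashFile: "/repo/cache/build.global_inputs_hash",
});
const missingCase = resolveGlobalInputHashFile({ cwd: "/repo/packages/a", globalInputHashFile: "" });
```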