Source Archival Improvements 2020

Over the past few years we have been optimizing the (legacy) code that sits between TeamCity and our custom services, providing integration and performing the Source & Artefact Archival.

History

I thought it would be interesting to install some old versions of our custom TeamCity plugin in Dev and compare Source Archival performance using a real build with 34,200 files.

Note: The 34k file repository is one of the largest repositories used on ICG Build. It is good for testing performance, but a repository of this size is not recommended. Anything over 10k files is considered large.

2016:
     Source Preparation (28m:23s)
     Source Upload (1h:47m:47s)
     Total: 2h 16m (136m)

2017:
     Source Preparation (28m:08s)
     Source Upload (30m:28s)
     Total: 58m 36s

2019:
     Source Preparation (1m:30s)
     Source Upload (2m:30s)
     Total: 4m 0s

Optimizing the existing code gave us good improvements over the years, and archival was down to a reasonable time while all our agents were 4-CPU VMs. However, the introduction of containerized build agents exposed a problem.
Running the same test on a containerized build agent with a CPU request of 100 millicores and a limit of 1 core:

2019 OpenShift Agent:
     Source Preparation (5m:40s)
     Source Upload (9m:00s)
     Total: 14m 40s

2020 Update

I had an opportunity to write a brand-new Java library (client side) to perform the integration with our custom services and the Source & Artefact Archival.

The new library performed a lot better!
Reuse is good, but sometimes you need a rewrite. Just make sure you can go back and delete the old code!

2020 VM:
     Source Upload (23s)
     Total: 23s
     = 90% reduction in processing time from previous version (4m 0s)

2020 ECS Agent:
     Source Upload (47s)
     Total: 47s
     = 95% reduction in processing time from previous version (14m 40s)

* The Source Preparation step is now combined into the Source Upload step.
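
For anyone who wants to check the percentages quoted above, here is a minimal sketch of the arithmetic. The class and method names are purely illustrative and are not part of our library; only the timings come from the results above.

```java
import java.time.Duration;

// Illustrative helper (hypothetical name): computes the percentage reduction between two timings.
final class ArchivalTimings {

    // Percentage reduction going from `before` to `after`, rounded to the nearest whole percent.
    static long reductionPercent(Duration before, Duration after) {
        double remaining = (double) after.toMillis() / before.toMillis();
        return Math.round((1.0 - remaining) * 100);
    }

    public static void main(String[] args) {
        // Totals taken from the results listed above.
        Duration vm2019 = Duration.ofMinutes(4);
        Duration vm2020 = Duration.ofSeconds(23);
        Duration container2019 = Duration.ofMinutes(14).plusSeconds(40);
        Duration container2020 = Duration.ofSeconds(47);

        System.out.println("VM reduction: " + reductionPercent(vm2019, vm2020) + "%");
        System.out.println("Container reduction: " + reductionPercent(container2019, container2020) + "%");
    }
}
```

The VM case works out to 1 - 23/240, roughly 90%, and the container case to 1 - 47/880, roughly 95%.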

Putting these results into a graph helps show some of the improvements over the years:
Historical time taken in minutes to archive 34,000 files:

Focusing on the latest change:

Time taken in minutes to archive 34,000 files, on standard VM and Containerized Build Agents:

Real World Example

I found a build in Production with over 50k source files!

Before our latest release:

After our release:

Bulk REST API

As part of the rewrite, I updated the server-side code to accept a list of JSON entities, instead of one entity per request.
I then updated the client-side code to make use of this, which significantly reduced our traffic.
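
To illustrate the idea, here is a rough sketch of client-side batching. It is not the actual library code: the class name, the batch size of 500, and the shape of the JSON entities are all assumptions made for the example, and the HTTP call itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: groups entities so each request carries a list instead of a single item.
final class BulkBatcher {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Joins already-serialized JSON objects into a single JSON array request body.
    static String toJsonArray(List<String> jsonObjects) {
        return "[" + String.join(",", jsonObjects) + "]";
    }

    public static void main(String[] args) {
        // Assumed entity shape, for illustration only.
        List<String> entities = new ArrayList<>();
        for (int i = 0; i < 34_200; i++) {
            entities.add("{\"path\":\"src/File" + i + ".java\"}");
        }

        List<List<String>> batches = partition(entities, 500);
        System.out.println("One entity per request: " + entities.size() + " requests");
        System.out.println("Bulk (500 per request): " + batches.size() + " requests");
    }
}
```

With 34,200 entities and an assumed batch size of 500, that is 69 requests instead of 34,200. The right batch size in practice depends on payload size and server limits, so treat 500 as a placeholder.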

Geoffrey Cummings