@ArneTR ArneTR commented Dec 16, 2023

This PR adds a new table to the "status" page showing the current job status of the machines and an approximate waiting time.

The waiting time is calculated as JOB_AMOUNT * 15 minutes + JOB_AMOUNT * COOLDOWN_TIME.

All config data is updated through the client.py script on cluster announce.
Screenshot 2023-12-16 at 10 04 52 PM
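
As a rough illustration of the formula above, here is a minimal sketch. The function and constant names are hypothetical and not taken from the PR, and it assumes COOLDOWN_TIME is given in seconds.

```javascript
// Assumed average job duration from the PR description: 15 minutes.
const AVERAGE_JOB_SECONDS = 15 * 60;

// Hypothetical helper: estimated waiting time in minutes.
// jobAmount: number of waiting jobs; cooldownSeconds: cooldown per job (assumed seconds).
function estimateWaitingMinutes(jobAmount, cooldownSeconds) {
    const totalSeconds = jobAmount * AVERAGE_JOB_SECONDS + jobAmount * cooldownSeconds;
    return totalSeconds / 60;
}

// Example: 3 waiting jobs with a 300 second cooldown.
console.log(estimateWaitingMinutes(3, 300)); // 60
```

With three waiting jobs and a five minute cooldown this gives 3 * 900 s + 3 * 300 s = 3600 s, i.e. 60 minutes.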

@ArneTR ArneTR requested a review from ribalba December 16, 2023 21:06
ArneTR commented Dec 16, 2023

@ribalba Any additional feature you would like to see on the status page?

github-actions bot commented Dec 16, 2023

Old Energy Estimation

Eco-CI Output:

| Label | 🖥 avg. CPU utilization [%] | 🔋 Total Energy [Joules] | 🔌 avg. Power [Watts] | Duration [Seconds] |
|---|---|---|---|---|
| Total Run | 9.54754 | 1694.26 | 2.49155 | 687 |
| Measurement #1 | 9.5599 | 1694.26 | 2.49155 | 682 |

📈 Energy graph: [ASCII plot of Watts over time; y-axis from 1.77 to 7.74]
@ribalba ribalba left a comment

Otherwise looks great

{ data: 6, title: 'Waiting Jobs'},
{ data: 5, title: 'Estimated waiting time', render: function(el, type, row) {
// 900 seconds is the current average job time. We add this to the amount of waiting time
return `${( (900+row[5]) * row[6]) / 60} Minutes`
Is there a way to calculate the 900 dynamically? It might change depending on the workloads. Maybe take the average time of the last 10 runs? Couldn't this be done with an SQL subquery in the SELECT?
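
The reviewer proposes doing this in SQL. Purely to illustrate the arithmetic being suggested, here is a small JavaScript sketch of "average of the last 10 runs". The helper name and the input shape (an array of run durations in seconds, ordered oldest to newest) are assumptions, not part of the project.

```javascript
// Hypothetical helper: mean duration in seconds of the most recent `limit` runs.
// `durations` is assumed to be ordered from oldest to newest.
function averageRecentDuration(durations, limit = 10) {
    const recent = durations.slice(-limit);
    if (recent.length === 0) {
        return null; // no runs in the history yet
    }
    const sum = recent.reduce((acc, seconds) => acc + seconds, 0);
    return sum / recent.length;
}

console.log(averageRecentDuration([600, 900, 1200])); // 900
```

Returning `null` for an empty history leaves it to the caller to decide what to display when no estimate is possible.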

Good one. Will do.

* main:
  Hotfix for check on frequency provider
  Tests run_until must be guard-claused with cleanup routine (#616)
  Fix check if stderr is empty (#613)
  Bump uvicorn[standard] from 0.24.0.post1 to 0.25.0 (#612)
  Fxing the network provider stderror
  Branch and filename are now always not null (#602)
  Adds a more elaborate depends_on test
  Support reading notes from services (#590)
  docker build command in tests now checks reason for docker build failure. If it is a permission issue with the cache, it will continue the rest of the workflow (#576)
  Use depends_on for container startup order (refactored) (#593)
  Bump psycopg[binary] from 3.1.15 to 3.1.16 (#610)
  Added powercap info to hardware_info (#609)
  Changed wording for network infrastructure box (#608)
  Added SIGQUIT to nginx and initi to gunicorn, as we are using bash script in entrypoint (#605)
  Fix frontend flow menu to wrap automatically (#584)
  Bump psutil from 5.9.6 to 5.9.7 (#603)
  Disable Docker CLI hints (#555)
  Create codeql.yml
ArneTR commented Dec 22, 2023

Sometimes the waiting time is unknown, either because the machine has not yet reported its cooldown time or because we have no runs in the history.

In both cases I just show "Unknown".
Screenshot 2023-12-22 at 4 41 29 PM
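
A minimal sketch of the fallback described above, assuming the average job time and the cooldown time arrive as numbers in seconds or as `null` when missing. The function name and parameters are hypothetical and do not mirror the merged code.

```javascript
// Hypothetical render helper: returns a display string for the status table.
// averageJobSeconds: average duration of recent runs, or null if no history.
// cooldownSeconds: machine cooldown time, or null if not yet reported.
// waitingJobs: number of jobs currently waiting on the machine.
function renderWaitingTime(averageJobSeconds, cooldownSeconds, waitingJobs) {
    if (averageJobSeconds == null || cooldownSeconds == null) {
        return 'Unknown';
    }
    const minutes = ((averageJobSeconds + cooldownSeconds) * waitingJobs) / 60;
    return `${minutes} Minutes`;
}

console.log(renderWaitingTime(900, 300, 2));  // "40 Minutes"
console.log(renderWaitingTime(null, 300, 2)); // "Unknown"
```

The loose `== null` check covers both `null` and `undefined`, so a missing field in the row is treated the same as an explicit null.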

Waiting for the tests to run, then I will put it online.

@github-actions
Eco-CI Output:

| Label | 🖥 avg. CPU utilization [%] | 🔋 Total Energy [Joules] | 🔌 avg. Power [Watts] | Duration [Seconds] |
|---|---|---|---|---|
| Total Run | 10.2819 | 1701.95 | 2.55166 | 675 |
| Measurement #1 | 10.3315 | 1701.95 | 2.55166 | 669 |

📈 Energy graph: [ASCII plot of Watts over time; y-axis from 1.77 to 7.78]
@ArneTR ArneTR merged commit 7f416e4 into main Dec 22, 2023
@ArneTR ArneTR deleted the status-waiting-time branch December 22, 2023 15:57
ArneTR added a commit that referenced this pull request Dec 23, 2023
* main: Text change Value formatting on status page Normalized URL for machines endpoint Less confusing error messages Status has now a waiting time (#599) Run ID is now accessible even after fail and thus can be sent via ema… (#601)
ArneTR added a commit that referenced this pull request Dec 23, 2023
* main: (26 commits) Disable tinyproxy systemd service (#623) Text change Value formatting on status page Normalized URL for machines endpoint Less confusing error messages Status has now a waiting time (#599) Run ID is now accessible even after fail and thus can be sent via ema… (#601) Switched from cmd to command (#615) Hotfix for check on frequency provider Tests run_until must be guard-claused with cleanup routine (#616) Fix check if stderr is empty (#613) Bump uvicorn[standard] from 0.24.0.post1 to 0.25.0 (#612) Fxing the network provider stderror Branch and filename are now always not null (#602) Adds a more elaborate depends_on test Support reading notes from services (#590) docker build command in tests now checks reason for docker build failure. If it is a permission issue with the cache, it will continue the rest of the workflow (#576) Use depends_on for container startup order (refactored) (#593) Bump psycopg[binary] from 3.1.15 to 3.1.16 (#610) Added powercap info to hardware_info (#609) ...
ArneTR added a commit that referenced this pull request Jan 1, 2024
* main: (23 commits) System check providers running (#619) Stderr is now by default UTF-8 (#624) Refactored kill/killpg mechanism to be unified and actually fail on n… (#625) Command fix. Must be list append Refactorings Moved tinyproxy out of if clause Refactoring for error messages and security fix for path echoing (#636) GMT color via own commit hash (#634) Hotfix for branch not main Non-Blocking starlette body read (#633) Bump fastapi from 0.105.0 to 0.108.0 (#632) Updated XGBoost submodule Bump pydantic from 2.5.2 to 2.5.3 (#628) Added stddev to timeline (#627) Disable tinyproxy systemd service (#623) Text change Value formatting on status page Normalized URL for machines endpoint Less confusing error messages Status has now a waiting time (#599) ...