[RFC,2/2] recheck: Add a recheck parser for patchwork comments

Message ID 20231027130609.3035396-2-aconole@redhat.com (mailing list archive)
State New
Headers
Series [RFC,1/2] pw_mon: improve command line options |

Commit Message

Aaron Conole Oct. 27, 2023, 1:06 p.m. UTC
  Add a recheck parsing tool that will allow for labs to build a
recheck workflow based on specific recheck labels and projects,
with an associated state machine and querying interface.

The output of the recheck parsing tool is json and can be fed to
jq or other json parsing utilities for better field support.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 pw_mon           | 59 +++++++++++++++++++++++++++++-
 recheck_tool     | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
 series_db_lib.sh | 62 +++++++++++++++++++++++++++++++-
 3 files changed, 212 insertions(+), 2 deletions(-)
 create mode 100755 recheck_tool
  

Comments

Michael Santana Nov. 1, 2023, 4:57 p.m. UTC | #1
On Fri, Oct 27, 2023 at 9:06 AM Aaron Conole <aconole@redhat.com> wrote:
>
> Add a recheck parsing tool that will allow for labs to build a
> recheck workflow based on specific recheck labels and projects,
> with an associated state machine and querying interface.
>
> The output of the recheck parsing tool is json and can be fed to
> jq or other json parsing utilities for better field support.
>
> Signed-off-by: Aaron Conole <aconole@redhat.com>
Thank you Aaron for the patch. It looks like you spent a lot of time on it

Overall I like the patch and dont have major concerns other than the
questions I have made in-line

Thanks!
> ---
>  pw_mon           | 59 +++++++++++++++++++++++++++++-
>  recheck_tool     | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
>  series_db_lib.sh | 62 +++++++++++++++++++++++++++++++-
>  3 files changed, 212 insertions(+), 2 deletions(-)
>  create mode 100755 recheck_tool
>
> diff --git a/pw_mon b/pw_mon
> index 01bdd25..d5ad8f5 100755
> --- a/pw_mon
> +++ b/pw_mon
> @@ -1,6 +1,6 @@
>  #!/bin/sh
>  # SPDX-Identifier: gpl-2.0-or-later
> -# Copyright (C) 2018, Red Hat, Inc.
> +# Copyright (C) 2018-2023 Red Hat, Inc.
>  #
>  # Monitors a project on a patchwork instance for new series submissions
>  # Records the submissions in the series database (and emits them on the
> @@ -44,6 +44,7 @@ if [ "$1" != "" ]; then
>      fi
>  fi
>
> +recheck_filter=""
>
>  while [ "$1" != "" ]; do
>      if echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
> @@ -59,6 +60,13 @@ while [ "$1" != "" ]; do
>          echo "    --add-filter-recheck=filter  Adds a filter to flag that a recheck needs to be done"
>          echo ""
>          exit 0
> +    elif echo "$1" | grep -E ^--add-filter-recheck= >/dev/null 2>&1; then
> +        filter_str=$(echo "$1" | sed s/--add-filter-recheck=//)
> +        recheck_filter="$filter_str $recheck_filter"
> +        shift
> +    else
> +        echo "Uknown option: $1"
s/Uknown/Unknown/
> +        exit 1
>      fi
>  done
>
> @@ -179,7 +187,56 @@ function check_superseded_series() {
>      done
>  }
>
> +function run_recheck() {
> +    local recheck_list=$(echo "$7" | sed -e 's/^Recheck-request: // ' -e 's/,/ /g')
> +
> +    for filter in $recheck_filter; do
> +        for check in $recheck_list; do
> +            if [ "$filter" == "$check" ]; then
> +                insert_recheck_request_if_needed "$1" "$3" "$8" "$check" "$2"
> +            fi
> +        done
> +    done
> +}
> +
> +function check_patch_for_retest_request() {
> +    local patch_url="$1"
> +
> +    local patch_comments_url=$(curl -s "$userpw" "$patch_url" | jq -rc '.comments')
> +    if [ "Xnull" != "X$patch_comments_url" ]; then
> +        local comments_json=$(curl -s "$userpw" "$patch_comments_url")
> +
> +        local seq_end=$(echo "$comments_json" | jq -rc 'length')
> +        if [ "$seq_end" -a $seq_end -gt 0 ]; then
Isnt this just a longer way of saying " if [ $seq_end -gt 0 ] " ?
> +            seq_end=$((seq_end-1))
> +            for comment_id in $(seq 0 $seq_end); do
> +                local recheck_requested=$(echo "$comments_json" | jq -rc ".[$comment_id].content" | grep "^Recheck-request: ")
> +                if [ "X$recheck_requested" != "X" ]; then
> +                    local msgid=$(echo "$comments_json" | jq -rc ".[$comment_id].msgid")
> +                    run_recheck "$pw_instance" "$series_id" "$project" "$url" "$repo" "$branchname" "$recheck_requested" "$msgid"
> +                fi
> +            done
> +        fi
> +    fi
> +}
> +
> +function check_series_needs_retest() {
> +    local pw_instance="$1"
> +
> +    series_get_active_branches "$pw_instance" | while IFS=\| read -r series_id project url repo branchname; do
> +        local patch_comments_url=$(curl -s "$userpw" "$url" | jq -rc '.patches[] | .url')
> +
> +        for patch in $patch_comments_url; do
> +            check_patch_for_retest_request $patch
> +        done
> +    done
> +}
> +
>  check_undownloaded_series "$pw_instance" "$pw_project"
>  check_completed_series "$pw_instance" "$pw_project"
>  check_new_series "$pw_instance" "$pw_project"
>  check_superseded_series "$pw_instance"
> +
> +# check for retest requests after a series is still passing all the
> +# checks above
> +check_series_needs_retest "$pw_instance"
Okay, I am trying to understand what the workflow here is. I think I
understand that this script will automatically go and check the series
for comments that match "Recheck-request:". This string means that
someone asked the bot to recheck the patch/series.

This script will call into insert_recheck_request_if_needed(), which
will do a check in the database to see if we have already done a
recheck to avoid repeating checking the same patches over and over.
This is done by checking the $recheck_msgid value in the database

I like this workflow. The only thing that I do not like is that you
have to check every comment on every patch. That seems like an
expensive operation, but honestly I do not think there is a better way
to accomplish this. So if there is no better way to do it then it's
okay, let's move forward with it



> diff --git a/recheck_tool b/recheck_tool
> new file mode 100755
> index 0000000..f346e1c
> --- /dev/null
> +++ b/recheck_tool
I guess it wasnt very obvious to me. But what is the purpose of this
script? for us to manually add an entry in the database to run a
recheck?
> @@ -0,0 +1,93 @@
> +#!/bin/sh
> +# SPDX-Identifier: gpl-2.0-or-later
> +# Copyright (C) 2023 Red Hat, Inc.
> +#
> +# Licensed under the terms of the GNU General Public License as published
> +# by the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.  You may obtain a copy of the
> +# license at
> +#
> +#    https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
> +#
> +# Unless required by applicable law or agreed to in writing, software
> +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
> +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
> +# License for the specific language governing permissions and limitations
> +# under the License.
> +
> +mode="select"
> +
> +while [ "$1" != "" ]; do
> +    if echo "$1" | grep -E ^--help >/dev/null 2>&1; then
> +        echo "recheck / retest state machine script"
> +        echo ""
> +        echo "$0:"
> +        echo " --pw-project=<proj>:    Patchwork project."
> +        echo " --pw-instance=<inst>:   Patchwork instance."
> +        echo " --filter=<str>:         Job / request for recheck."
> +        echo " --state=<0..>:          Resync state ID."
> +        echo " --msgid=<msgid>:                Message ID to select."
> +        echo " --update:               Set tool in update mode"
> +        echo " --new-state=<0..>:      New state ID to set"
> +        echo " --series-id=<..>:       Series ID"
> +        echo ""
> +        echo "Will spit out a parsable json for each db line when selecting"
> +        exit 0
> +    elif echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
> +        pw_project=$(echo "$1" | sed s/--pw-project=//)
> +    elif echo "$1" | grep -E ^--pw-instance= >/dev/null 2>&1; then
> +        pw_instance=$(echo "$1" | sed s/--pw-instance=//)
> +    elif echo "$1" | grep -E ^--filter= >/dev/null 2>&1; then
> +        filter=$(echo "$1" | sed s/--filter=//)
> +    elif echo "$1" | grep -E ^--state= >/dev/null 2>&1; then
> +        recheck_state=$(echo "$1" | sed s/--state=//)
> +    elif echo "$1" | grep -E ^--msgid= >/dev/null 2>&1; then
> +        message_id=$(echo "$1" | sed s/--msgid=//)
> +    elif echo "$1" | grep -E ^--update >/dev/null 2>&1; then
> +        mode="update"
> +    elif echo "$1" | grep -E ^--new-state= >/dev/null 2>&1; then
> +        new_recheck_state=$(echo "$1" | sed s/--new-state=//)
> +    elif echo "$1" | grep -E ^--series-id= >/dev/null 2>&1; then
> +        series_id=$(echo "$1" | sed s/--series-id=//)
> +    else
> +        echo "unknown option $1"
> +        exit 1
> +    fi
> +    shift
> +done
> +
> +source $(dirname $0)/series_db_lib.sh
> +
> +if [ "$mode" == "select" ]; then
> +    echo "{\"rechecks\":["
> +    for request in $(get_recheck_requests_by_project "$pw_instance" \
> +                                                     "$pw_project" \
> +                                                     "$recheck_state" \
> +                                                     "$filter"); do
> +        message_id=$(echo $request | cut -d\| -f1)
> +        series_id=$(echo $request | cut -d\| -f2)
> +        echo "{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$recheck_state\", \"requested\": \"$filter\"}"
> +    done
> +    echo "]}"
> +elif [ "$mode" == "update" ]; then
> +    if [ "X$new_recheck_state" == "X" -o "X$series_id" == "X" ]; then
> +        echo "Need to set a series-id and a new recheck state when updating."
> +        exit 1
> +    fi
> +
> +    request=$(get_recheck_request "$pw_instance" "$pw_project" "$message_id" \
> +                                  "$filter" "$series_id" "$recheck_state")
> +    if [ "X$request" == "X" ]; then
> +        echo "{\"result\":\"notfound\"}"
> +        exit 0
> +    fi
> +
> +    set_recheck_request_state "$pw_instance" "$pw_project" "$message_id" \
> +                              "$filter" "$series_id" "$new_recheck_state"
> +
> +    echo "{\"result\":\"executed\",\"recheck\":{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$new_recheck_state\", \"requested\": \"$filter\"}}"
> +else
> +    echo "Uknown state: $mode"
s/Uknown/Unknown/
> +    exit 1
> +fi
> +
> diff --git a/series_db_lib.sh b/series_db_lib.sh
> index 6c2d98e..a729337 100644
> --- a/series_db_lib.sh
> +++ b/series_db_lib.sh
> @@ -1,6 +1,6 @@
>  #!/bin/sh
>  # SPDX-Identifier: gpl-2.0-or-later
> -# Copyright (C) 2018,2019 Red Hat, Inc.
> +# Copyright (C) 2018-2023 Red Hat, Inc.
>  #
>  # Licensed under the terms of the GNU General Public License as published
>  # by the Free Software Foundation; either version 2 of the License, or
> @@ -114,6 +114,21 @@ EOF
>          run_db_command "INSERT INTO series_schema_version(id) values (7);"
>      fi
>
> +    run_db_command "select * from series_schema_version;" | egrep '^8$' > /dev/null 2>&1
> +    if [ $? -eq 1 ]; then
> +        sqlite3 ${HOME}/.series-db <<EOF
> +CREATE TABLE recheck_requests (
> +recheck_id INTEGER,
> +recheck_message_id STRING,
> +recheck_requested_by STRING,
> +recheck_series STRING,
> +patchwork_instance STRING,
> +patchwork_project STRING,
> +recheck_sync INTEGER
> +);
> +EOF
> +        run_db_command "INSERT INTO series_schema_version(id) values (8);"
> +    fi
>  }
>
>  function series_db_exists() {
> @@ -390,3 +405,48 @@ function get_patch_id_by_series_id_and_sha() {
>
>      echo "select patch_id from git_builds where patchwork_instance=\"$instance\" and series_id=$series_id and sha=\"$sha\";" | series_db_execute
>  }
> +
> +function get_recheck_requests_by_project() {
> +    local recheck_instance="$1"
> +    local recheck_project="$2"
> +    local recheck_state="$3"
> +    local recheck_requested_by="$4"
> +
> +    series_db_exists
> +
> +    echo "select recheck_message_id,recheck_series from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_sync=$recheck_state and recheck_requested_by=\"$recheck_requested_by\";" | series_db_execute
> +}
> +
> +function insert_recheck_request_if_needed() {
> +    local recheck_instance="$1"
> +    local recheck_project="$2"
> +    local recheck_msgid="$3"
> +    local recheck_requested_by="$4"
> +    local recheck_series="$5"
> +
> +    if ! echo "select * from recheck_requests where recheck_message_id=\"$recheck_msgid\";" | series_db_execute | grep $recheck_msgid >/dev/null 2>&1; then
> +        echo "INSERT INTO recheck_requests (recheck_message_id, recheck_requested_by, recheck_series, patchwork_instance, patchwork_project, recheck_sync) values (\"$recheck_msgid\", \"$recheck_requested_by\", \"$recheck_series\", \"$recheck_instance\", \"$recheck_project\", 0);" | series_db_execute
> +    fi
> +}
> +
> +function get_recheck_request() {
> +    local recheck_instance="$1"
> +    local recheck_project="$2"
> +    local recheck_msgid="$3"
> +    local recheck_requested_by="$4"
> +    local recheck_series="$5"
> +    local recheck_state="$6"
> +
> +    echo "select * from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\" and recheck_message_id=\"$recheck_msgid\" and recheck_sync=$recheck_state;" | series_db_execute
> +}
> +
> +function set_recheck_request_state() {
> +    local recheck_instance="$1"
> +    local recheck_project="$2"
> +    local recheck_msgid="$3"
> +    local recheck_requested_by="$4"
> +    local recheck_series="$5"
> +    local recheck_state="$6"
> +
> +    echo "UPDATE recheck_requests set recheck_sync=$recheck_state where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\";" | series_db_execute
> +}
> --
> 2.41.0
>
  
Aaron Conole Nov. 1, 2023, 7:16 p.m. UTC | #2
Michael Santana <msantana@redhat.com> writes:

> On Fri, Oct 27, 2023 at 9:06 AM Aaron Conole <aconole@redhat.com> wrote:
>>
>> Add a recheck parsing tool that will allow for labs to build a
>> recheck workflow based on specific recheck labels and projects,
>> with an associated state machine and querying interface.
>>
>> The output of the recheck parsing tool is json and can be fed to
>> jq or other json parsing utilities for better field support.
>>
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
> Thank you Aaron for the patch. It looks like you spent a lot of time on it
>
> Overall I like the patch and dont have major concerns other than the
> questions I have made in-line
>
> Thanks!
>> ---
>>  pw_mon           | 59 +++++++++++++++++++++++++++++-
>>  recheck_tool     | 93 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  series_db_lib.sh | 62 +++++++++++++++++++++++++++++++-
>>  3 files changed, 212 insertions(+), 2 deletions(-)
>>  create mode 100755 recheck_tool
>>
>> diff --git a/pw_mon b/pw_mon
>> index 01bdd25..d5ad8f5 100755
>> --- a/pw_mon
>> +++ b/pw_mon
>> @@ -1,6 +1,6 @@
>>  #!/bin/sh
>>  # SPDX-Identifier: gpl-2.0-or-later
>> -# Copyright (C) 2018, Red Hat, Inc.
>> +# Copyright (C) 2018-2023 Red Hat, Inc.
>>  #
>>  # Monitors a project on a patchwork instance for new series submissions
>>  # Records the submissions in the series database (and emits them on the
>> @@ -44,6 +44,7 @@ if [ "$1" != "" ]; then
>>      fi
>>  fi
>>
>> +recheck_filter=""
>>
>>  while [ "$1" != "" ]; do
>>      if echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
>> @@ -59,6 +60,13 @@ while [ "$1" != "" ]; do
>>          echo "    --add-filter-recheck=filter  Adds a filter to flag that a recheck needs to be done"
>>          echo ""
>>          exit 0
>> +    elif echo "$1" | grep -E ^--add-filter-recheck= >/dev/null 2>&1; then
>> +        filter_str=$(echo "$1" | sed s/--add-filter-recheck=//)
>> +        recheck_filter="$filter_str $recheck_filter"
>> +        shift
>> +    else
>> +        echo "Uknown option: $1"
> s/Uknown/Unknown/

d'oh - will fix these.

>> +        exit 1
>>      fi
>>  done
>>
>> @@ -179,7 +187,56 @@ function check_superseded_series() {
>>      done
>>  }
>>
>> +function run_recheck() {
>> +    local recheck_list=$(echo "$7" | sed -e 's/^Recheck-request: // ' -e 's/,/ /g')
>> +
>> +    for filter in $recheck_filter; do
>> +        for check in $recheck_list; do
>> +            if [ "$filter" == "$check" ]; then
>> +                insert_recheck_request_if_needed "$1" "$3" "$8" "$check" "$2"
>> +            fi
>> +        done
>> +    done
>> +}
>> +
>> +function check_patch_for_retest_request() {
>> +    local patch_url="$1"
>> +
>> +    local patch_comments_url=$(curl -s "$userpw" "$patch_url" | jq -rc '.comments')
>> +    if [ "Xnull" != "X$patch_comments_url" ]; then
>> +        local comments_json=$(curl -s "$userpw" "$patch_comments_url")
>> +
>> +        local seq_end=$(echo "$comments_json" | jq -rc 'length')
>> +        if [ "$seq_end" -a $seq_end -gt 0 ]; then
> Isnt this just a longer way of saying " if [ $seq_end -gt 0 ] " ?

No - in case the jq fails (lets say webserver had an issue), the seq_end
value will be empty.  That would cause error if we try to run '-gt'
operation.  We avoid it by testing that there is at least something
there.

>> +            seq_end=$((seq_end-1))
>> +            for comment_id in $(seq 0 $seq_end); do
>> + local recheck_requested=$(echo "$comments_json" | jq -rc
>> ".[$comment_id].content" | grep "^Recheck-request: ")
>> +                if [ "X$recheck_requested" != "X" ]; then
>> +                    local msgid=$(echo "$comments_json" | jq -rc ".[$comment_id].msgid")
>> + run_recheck "$pw_instance" "$series_id" "$project" "$url" "$repo"
>> "$branchname" "$recheck_requested" "$msgid"
>> +                fi
>> +            done
>> +        fi
>> +    fi
>> +}
>> +
>> +function check_series_needs_retest() {
>> +    local pw_instance="$1"
>> +
>> + series_get_active_branches "$pw_instance" | while IFS=\| read -r
>> series_id project url repo branchname; do
>> +        local patch_comments_url=$(curl -s "$userpw" "$url" | jq -rc '.patches[] | .url')
>> +
>> +        for patch in $patch_comments_url; do
>> +            check_patch_for_retest_request $patch
>> +        done
>> +    done
>> +}
>> +
>>  check_undownloaded_series "$pw_instance" "$pw_project"
>>  check_completed_series "$pw_instance" "$pw_project"
>>  check_new_series "$pw_instance" "$pw_project"
>>  check_superseded_series "$pw_instance"
>> +
>> +# check for retest requests after a series is still passing all the
>> +# checks above
>> +check_series_needs_retest "$pw_instance"
> Okay, I am trying to understand what the workflow here is. I think I
> understand that this script will automatically go and check the series
> for comments that match "Recheck-request:". This string means that
> someone asked the bot to recheck the patch/series.

Correct - if someone sends something like:

 Recheck-request: a,b,c,d

and this pw_mon script is called with filters for 'a' and 'b', it will
flag those patches in the DB.

> This script will call into insert_recheck_request_if_needed(), which
> will do a check in the database to see if we have already done a
> recheck to avoid repeating checking the same patches over and over.
> This is done by checking the $recheck_msgid value in the database

Correct.

> I like this workflow. The only thing that I do not like is that you
> have to check every comment on every patch. That seems like an
> expensive operation, but honestly I do not think there is a better way
> to accomplish this. So if there is no better way to do it then it's
> okay, let's move forward with it

There isn't a different way to do it for now, but I hope to switch to
using the events API which should mean we only look at the most recent
events that come in.

>> diff --git a/recheck_tool b/recheck_tool
>> new file mode 100755
>> index 0000000..f346e1c
>> --- /dev/null
>> +++ b/recheck_tool
> I guess it wasnt very obvious to me. But what is the purpose of this
> script? for us to manually add an entry in the database to run a
> recheck?

No - this script cannot insert an entry.  However, it can modify
existing entries.  This lets us define a set of transitions for each
operation that might need to happen for every type that exists.  For
instance, in github, we need to kick off the new testing, then we need
to monitor for the new results, then we need to report the new results.

These could be represented by individual states.  But, for example, if
there's a future where we add an option to re-apply to a new tree or
something, we can insert a state to handle that.

>> @@ -0,0 +1,93 @@
>> +#!/bin/sh
>> +# SPDX-Identifier: gpl-2.0-or-later
>> +# Copyright (C) 2023 Red Hat, Inc.
>> +#
>> +# Licensed under the terms of the GNU General Public License as published
>> +# by the Free Software Foundation; either version 2 of the License, or
>> +# (at your option) any later version.  You may obtain a copy of the
>> +# license at
>> +#
>> +#    https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
>> +#
>> +# Unless required by applicable law or agreed to in writing, software
>> +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
>> +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
>> +# License for the specific language governing permissions and limitations
>> +# under the License.
>> +
>> +mode="select"
>> +
>> +while [ "$1" != "" ]; do
>> +    if echo "$1" | grep -E ^--help >/dev/null 2>&1; then
>> +        echo "recheck / retest state machine script"
>> +        echo ""
>> +        echo "$0:"
>> +        echo " --pw-project=<proj>:    Patchwork project."
>> +        echo " --pw-instance=<inst>:   Patchwork instance."
>> +        echo " --filter=<str>:         Job / request for recheck."
>> +        echo " --state=<0..>:          Resync state ID."
>> +        echo " --msgid=<msgid>:                Message ID to select."
>> +        echo " --update:               Set tool in update mode"
>> +        echo " --new-state=<0..>:      New state ID to set"
>> +        echo " --series-id=<..>:       Series ID"
>> +        echo ""
>> +        echo "Will spit out a parsable json for each db line when selecting"
>> +        exit 0
>> +    elif echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
>> +        pw_project=$(echo "$1" | sed s/--pw-project=//)
>> +    elif echo "$1" | grep -E ^--pw-instance= >/dev/null 2>&1; then
>> +        pw_instance=$(echo "$1" | sed s/--pw-instance=//)
>> +    elif echo "$1" | grep -E ^--filter= >/dev/null 2>&1; then
>> +        filter=$(echo "$1" | sed s/--filter=//)
>> +    elif echo "$1" | grep -E ^--state= >/dev/null 2>&1; then
>> +        recheck_state=$(echo "$1" | sed s/--state=//)
>> +    elif echo "$1" | grep -E ^--msgid= >/dev/null 2>&1; then
>> +        message_id=$(echo "$1" | sed s/--msgid=//)
>> +    elif echo "$1" | grep -E ^--update >/dev/null 2>&1; then
>> +        mode="update"
>> +    elif echo "$1" | grep -E ^--new-state= >/dev/null 2>&1; then
>> +        new_recheck_state=$(echo "$1" | sed s/--new-state=//)
>> +    elif echo "$1" | grep -E ^--series-id= >/dev/null 2>&1; then
>> +        series_id=$(echo "$1" | sed s/--series-id=//)
>> +    else
>> +        echo "unknown option $1"
>> +        exit 1
>> +    fi
>> +    shift
>> +done
>> +
>> +source $(dirname $0)/series_db_lib.sh
>> +
>> +if [ "$mode" == "select" ]; then
>> +    echo "{\"rechecks\":["
>> +    for request in $(get_recheck_requests_by_project "$pw_instance" \
>> +                                                     "$pw_project" \
>> +                                                     "$recheck_state" \
>> +                                                     "$filter"); do
>> +        message_id=$(echo $request | cut -d\| -f1)
>> +        series_id=$(echo $request | cut -d\| -f2)
>> + echo "{\"pw_instance\": \"$pw_instance\",
>> \"series_id\":$series_id, \"msg_id\":\"$message_id\",
>> \"state\":\"$recheck_state\", \"requested\": \"$filter\"}"
>> +    done
>> +    echo "]}"
>> +elif [ "$mode" == "update" ]; then
>> +    if [ "X$new_recheck_state" == "X" -o "X$series_id" == "X" ]; then
>> +        echo "Need to set a series-id and a new recheck state when updating."
>> +        exit 1
>> +    fi
>> +
>> +    request=$(get_recheck_request "$pw_instance" "$pw_project" "$message_id" \
>> +                                  "$filter" "$series_id" "$recheck_state")
>> +    if [ "X$request" == "X" ]; then
>> +        echo "{\"result\":\"notfound\"}"
>> +        exit 0
>> +    fi
>> +
>> +    set_recheck_request_state "$pw_instance" "$pw_project" "$message_id" \
>> +                              "$filter" "$series_id" "$new_recheck_state"
>> +
>> + echo "{\"result\":\"executed\",\"recheck\":{\"pw_instance\":
>> \"$pw_instance\", \"series_id\":$series_id,
>> \"msg_id\":\"$message_id\", \"state\":\"$new_recheck_state\",
>> \"requested\": \"$filter\"}}"
>> +else
>> +    echo "Uknown state: $mode"
> s/Uknown/Unknown/

ACK

>> +    exit 1
>> +fi
>> +
>> diff --git a/series_db_lib.sh b/series_db_lib.sh
>> index 6c2d98e..a729337 100644
>> --- a/series_db_lib.sh
>> +++ b/series_db_lib.sh
>> @@ -1,6 +1,6 @@
>>  #!/bin/sh
>>  # SPDX-Identifier: gpl-2.0-or-later
>> -# Copyright (C) 2018,2019 Red Hat, Inc.
>> +# Copyright (C) 2018-2023 Red Hat, Inc.
>>  #
>>  # Licensed under the terms of the GNU General Public License as published
>>  # by the Free Software Foundation; either version 2 of the License, or
>> @@ -114,6 +114,21 @@ EOF
>>          run_db_command "INSERT INTO series_schema_version(id) values (7);"
>>      fi
>>
>> +    run_db_command "select * from series_schema_version;" | egrep '^8$' > /dev/null 2>&1
>> +    if [ $? -eq 1 ]; then
>> +        sqlite3 ${HOME}/.series-db <<EOF
>> +CREATE TABLE recheck_requests (
>> +recheck_id INTEGER,
>> +recheck_message_id STRING,
>> +recheck_requested_by STRING,
>> +recheck_series STRING,
>> +patchwork_instance STRING,
>> +patchwork_project STRING,
>> +recheck_sync INTEGER
>> +);
>> +EOF
>> +        run_db_command "INSERT INTO series_schema_version(id) values (8);"
>> +    fi
>>  }
>>
>>  function series_db_exists() {
>> @@ -390,3 +405,48 @@ function get_patch_id_by_series_id_and_sha() {
>>
>>      echo "select patch_id from git_builds where patchwork_instance=\"$instance\" and series_id=$series_id and sha=\"$sha\";" | series_db_execute
>>  }
>> +
>> +function get_recheck_requests_by_project() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_state="$3"
>> +    local recheck_requested_by="$4"
>> +
>> +    series_db_exists
>> +
>> +    echo "select recheck_message_id,recheck_series from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_sync=$recheck_state and recheck_requested_by=\"$recheck_requested_by\";" | series_db_execute
>> +}
>> +
>> +function insert_recheck_request_if_needed() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +
>> +    if ! echo "select * from recheck_requests where recheck_message_id=\"$recheck_msgid\";" | series_db_execute | grep $recheck_msgid >/dev/null 2>&1; then
>> +        echo "INSERT INTO recheck_requests (recheck_message_id, recheck_requested_by, recheck_series, patchwork_instance, patchwork_project, recheck_sync) values (\"$recheck_msgid\", \"$recheck_requested_by\", \"$recheck_series\", \"$recheck_instance\", \"$recheck_project\", 0);" | series_db_execute
>> +    fi
>> +}
>> +
>> +function get_recheck_request() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +    local recheck_state="$6"
>> +
>> +    echo "select * from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\" and recheck_message_id=\"$recheck_msgid\" and recheck_sync=$recheck_state;" | series_db_execute
>> +}
>> +
>> +function set_recheck_request_state() {
>> +    local recheck_instance="$1"
>> +    local recheck_project="$2"
>> +    local recheck_msgid="$3"
>> +    local recheck_requested_by="$4"
>> +    local recheck_series="$5"
>> +    local recheck_state="$6"
>> +
>> +    echo "UPDATE recheck_requests set recheck_sync=$recheck_state where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\";" | series_db_execute
>> +}
>> --
>> 2.41.0
>>
  
Thomas Monjalon Nov. 2, 2023, 10:44 a.m. UTC | #3
01/11/2023 20:16, Aaron Conole:
> Michael Santana <msantana@redhat.com> writes:
> > I like this workflow. The only thing that I do not like is that you
> > have to check every comment on every patch. That seems like an
> > expensive operation, but honestly I do not think there is a better way
> > to accomplish this. So if there is no better way to do it then it's
> > okay, let's move forward with it
> 
> There isn't a different way to do it for now, but I hope to switch to
> using the events API which should mean we only look at the most recent
> events that come in.

What prevent us to use the events API?
  
Aaron Conole Nov. 2, 2023, 1:03 p.m. UTC | #4
Thomas Monjalon <thomas@monjalon.net> writes:

> 01/11/2023 20:16, Aaron Conole:
>> Michael Santana <msantana@redhat.com> writes:
>> > I like this workflow. The only thing that I do not like is that you
>> > have to check every comment on every patch. That seems like an
>> > expensive operation, but honestly I do not think there is a better way
>> > to accomplish this. So if there is no better way to do it then it's
>> > okay, let's move forward with it
>> 
>> There isn't a different way to do it for now, but I hope to switch to
>> using the events API which should mean we only look at the most recent
>> events that come in.
>
> What prevent us to use the events API?

Ideally we could use it everywhere, but the pw-ci project is used for
other patchwork instances.  Events API for comments is a recent change,
and not every patchwork instance is upgraded to support it (for example,
both ozlabs and kernel.org patchwork instances don't have support).

I do have some detection code, and am planning on hooking that up so
that we can detect whether events API supports comment events based on
the filters offered, but that takes some time to test and validate.

So it becomes a question of which is more important - having something
working now, or spending time with the detection code.  Either way, we
need it for older patchworks that haven't upgraded to the just released
version (some projects are still on 2.2.0).

If you think it is better to do the events path first, I can go with
that but then we severely limit which projects get support for rechecks,
and I've already gotten the feature request for both OVS and OVN - so
we'd either need to support comments polling anyway, or do the massive
work of upgrading ozlabs instance.
  
Thomas Monjalon Nov. 2, 2023, 1:32 p.m. UTC | #5
02/11/2023 14:03, Aaron Conole:
> Thomas Monjalon <thomas@monjalon.net> writes:
> 
> > 01/11/2023 20:16, Aaron Conole:
> >> Michael Santana <msantana@redhat.com> writes:
> >> > I like this workflow. The only thing that I do not like is that you
> >> > have to check every comment on every patch. That seems like an
> >> > expensive operation, but honestly I do not think there is a better way
> >> > to accomplish this. So if there is no better way to do it then it's
> >> > okay, let's move forward with it
> >> 
> >> There isn't a different way to do it for now, but I hope to switch to
> >> using the events API which should mean we only look at the most recent
> >> events that come in.
> >
> > What prevent us to use the events API?
> 
> Ideally we could use it everywhere, but the pw-ci project is used for
> other patchwork instances.  Events API for comments is a recent change,
> and not every patchwork instance is upgraded to support it (for example,
> both ozlabs and kernel.org patchwork instances don't have support).
> 
> I do have some detection code, and am planning on hooking that up so
> that we can detect whether events API supports comment events based on
> the filters offered, but that takes some time to test and validate.
> 
> So it becomes a question of which is more important - having something
> working now, or spending time with the detection code.  Either way, we
> need it for older patchworks that haven't upgraded to the just released
> version (some projects are still on 2.2.0).

I agree better to work on general availability first.

> If you think it is better to do the events path first, I can go with
> that but then we severely limit which projects get support for rechecks,
> and I've already gotten the feature request for both OVS and OVN - so
> we'd either need to support comments polling anyway, or do the massive
> work of upgrading ozlabs instance.

Events API is an optimization. It can come later.
  

Patch

diff --git a/pw_mon b/pw_mon
index 01bdd25..d5ad8f5 100755
--- a/pw_mon
+++ b/pw_mon
@@ -1,6 +1,6 @@ 
 #!/bin/sh
 # SPDX-Identifier: gpl-2.0-or-later
-# Copyright (C) 2018, Red Hat, Inc.
+# Copyright (C) 2018-2023 Red Hat, Inc.
 #
 # Monitors a project on a patchwork instance for new series submissions
 # Records the submissions in the series database (and emits them on the
@@ -44,6 +44,7 @@  if [ "$1" != "" ]; then
     fi
 fi
 
+recheck_filter=""
 
 while [ "$1" != "" ]; do
     if echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
@@ -59,6 +60,13 @@  while [ "$1" != "" ]; do
         echo "    --add-filter-recheck=filter	Adds a filter to flag that a recheck needs to be done"
         echo ""
         exit 0
+    elif echo "$1" | grep -E ^--add-filter-recheck= >/dev/null 2>&1; then
+        filter_str=$(echo "$1" | sed s/--add-filter-recheck=//)
+        recheck_filter="$filter_str $recheck_filter"
+        shift
+    else
+        echo "Uknown option: $1"
+        exit 1
     fi
 done
 
@@ -179,7 +187,56 @@  function check_superseded_series() {
     done
 }
 
+function run_recheck() {
+    local recheck_list=$(echo "$7" | sed -e 's/^Recheck-request: // ' -e 's/,/ /g')
+
+    for filter in $recheck_filter; do
+        for check in $recheck_list; do
+            if [ "$filter" == "$check" ]; then
+                insert_recheck_request_if_needed "$1" "$3" "$8" "$check" "$2"
+            fi
+        done
+    done
+}
+
+function check_patch_for_retest_request() {
+    local patch_url="$1"
+
+    local patch_comments_url=$(curl -s "$userpw" "$patch_url" | jq -rc '.comments')
+    if [ "Xnull" != "X$patch_comments_url" ]; then
+        local comments_json=$(curl -s "$userpw" "$patch_comments_url")
+
+        local seq_end=$(echo "$comments_json" | jq -rc 'length')
+        if [ "$seq_end" -a $seq_end -gt 0 ]; then
+            seq_end=$((seq_end-1))
+            for comment_id in $(seq 0 $seq_end); do
+                local recheck_requested=$(echo "$comments_json" | jq -rc ".[$comment_id].content" | grep "^Recheck-request: ")
+                if [ "X$recheck_requested" != "X" ]; then
+                    local msgid=$(echo "$comments_json" | jq -rc ".[$comment_id].msgid")
+                    run_recheck "$pw_instance" "$series_id" "$project" "$url" "$repo" "$branchname" "$recheck_requested" "$msgid"
+                fi
+            done
+        fi
+    fi
+}
+
+function check_series_needs_retest() {
+    local pw_instance="$1"
+
+    series_get_active_branches "$pw_instance" | while IFS=\| read -r series_id project url repo branchname; do
+        local patch_comments_url=$(curl -s "$userpw" "$url" | jq -rc '.patches[] | .url')
+
+        for patch in $patch_comments_url; do
+            check_patch_for_retest_request $patch
+        done
+    done
+}
+
 check_undownloaded_series "$pw_instance" "$pw_project"
 check_completed_series "$pw_instance" "$pw_project"
 check_new_series "$pw_instance" "$pw_project"
 check_superseded_series "$pw_instance"
+
+# check for retest requests after a series is still passing all the
+# checks above
+check_series_needs_retest "$pw_instance"
diff --git a/recheck_tool b/recheck_tool
new file mode 100755
index 0000000..f346e1c
--- /dev/null
+++ b/recheck_tool
@@ -0,0 +1,93 @@ 
+#!/bin/sh
+# SPDX-Identifier: gpl-2.0-or-later
+# Copyright (C) 2023 Red Hat, Inc.
+#
+# Licensed under the terms of the GNU General Public License as published
+# by the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.  You may obtain a copy of the
+# license at
+#
+#    https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+mode="select"
+
+while [ "$1" != "" ]; do
+    if echo "$1" | grep -E ^--help >/dev/null 2>&1; then
+        echo "recheck / retest state machine script"
+        echo ""
+        echo "$0:"
+        echo " --pw-project=<proj>:	Patchwork project."
+        echo " --pw-instance=<inst>:	Patchwork instance."
+        echo " --filter=<str>:		Job / request for recheck."
+        echo " --state=<0..>:		Resync state ID."
+        echo " --msgid=<msgid>:		Message ID to select."
+        echo " --update:		Set tool in update mode"
+        echo " --new-state=<0..>:	New state ID to set"
+        echo " --series-id=<..>:	Series ID"
+        echo ""
+        echo "Will spit out a parsable json for each db line when selecting"
+        exit 0
+    elif echo "$1" | grep -E ^--pw-project= >/dev/null 2>&1; then
+        pw_project=$(echo "$1" | sed s/--pw-project=//)
+    elif echo "$1" | grep -E ^--pw-instance= >/dev/null 2>&1; then
+        pw_instance=$(echo "$1" | sed s/--pw-instance=//)
+    elif echo "$1" | grep -E ^--filter= >/dev/null 2>&1; then
+        filter=$(echo "$1" | sed s/--filter=//)
+    elif echo "$1" | grep -E ^--state= >/dev/null 2>&1; then
+        recheck_state=$(echo "$1" | sed s/--state=//)
+    elif echo "$1" | grep -E ^--msgid= >/dev/null 2>&1; then
+        message_id=$(echo "$1" | sed s/--msgid=//)
+    elif echo "$1" | grep -E ^--update >/dev/null 2>&1; then
+        mode="update"
+    elif echo "$1" | grep -E ^--new-state= >/dev/null 2>&1; then
+        new_recheck_state=$(echo "$1" | sed s/--new-state=//)
+    elif echo "$1" | grep -E ^--series-id= >/dev/null 2>&1; then
+        series_id=$(echo "$1" | sed s/--series-id=//)
+    else
+        echo "unknown option $1"
+        exit 1
+    fi
+    shift
+done
+
+source $(dirname $0)/series_db_lib.sh
+
+if [ "$mode" == "select" ]; then
+    echo "{\"rechecks\":["
+    for request in $(get_recheck_requests_by_project "$pw_instance" \
+                                                     "$pw_project" \
+                                                     "$recheck_state" \
+                                                     "$filter"); do
+        message_id=$(echo $request | cut -d\| -f1)
+        series_id=$(echo $request | cut -d\| -f2)
+        echo "{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$recheck_state\", \"requested\": \"$filter\"}"
+    done
+    echo "]}"
+elif [ "$mode" == "update" ]; then
+    if [ "X$new_recheck_state" == "X" -o "X$series_id" == "X" ]; then
+        echo "Need to set a series-id and a new recheck state when updating."
+        exit 1
+    fi
+
+    request=$(get_recheck_request "$pw_instance" "$pw_project" "$message_id" \
+                                  "$filter" "$series_id" "$recheck_state")
+    if [ "X$request" == "X" ]; then
+        echo "{\"result\":\"notfound\"}"
+        exit 0
+    fi
+
+    set_recheck_request_state "$pw_instance" "$pw_project" "$message_id" \
+                              "$filter" "$series_id" "$new_recheck_state"
+
+    echo "{\"result\":\"executed\",\"recheck\":{\"pw_instance\": \"$pw_instance\", \"series_id\":$series_id, \"msg_id\":\"$message_id\", \"state\":\"$new_recheck_state\", \"requested\": \"$filter\"}}"
+else
+    echo "Uknown state: $mode"
+    exit 1
+fi
+
diff --git a/series_db_lib.sh b/series_db_lib.sh
index 6c2d98e..a729337 100644
--- a/series_db_lib.sh
+++ b/series_db_lib.sh
@@ -1,6 +1,6 @@ 
 #!/bin/sh
 # SPDX-Identifier: gpl-2.0-or-later
-# Copyright (C) 2018,2019 Red Hat, Inc.
+# Copyright (C) 2018-2023 Red Hat, Inc.
 #
 # Licensed under the terms of the GNU General Public License as published
 # by the Free Software Foundation; either version 2 of the License, or
@@ -114,6 +114,21 @@  EOF
         run_db_command "INSERT INTO series_schema_version(id) values (7);"
     fi
 
+    run_db_command "select * from series_schema_version;" | egrep '^8$' > /dev/null 2>&1
+    if [ $? -eq 1 ]; then
+        sqlite3 ${HOME}/.series-db <<EOF
+CREATE TABLE recheck_requests (
+recheck_id INTEGER,
+recheck_message_id STRING,
+recheck_requested_by STRING,
+recheck_series STRING,
+patchwork_instance STRING,
+patchwork_project STRING,
+recheck_sync INTEGER
+);
+EOF
+        run_db_command "INSERT INTO series_schema_version(id) values (8);"
+    fi
 }
 
 function series_db_exists() {
@@ -390,3 +405,48 @@  function get_patch_id_by_series_id_and_sha() {
 
     echo "select patch_id from git_builds where patchwork_instance=\"$instance\" and series_id=$series_id and sha=\"$sha\";" | series_db_execute
 }
+
+function get_recheck_requests_by_project() {
+    local recheck_instance="$1"
+    local recheck_project="$2"
+    local recheck_state="$3"
+    local recheck_requested_by="$4"
+
+    series_db_exists
+
+    echo "select recheck_message_id,recheck_series from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_sync=$recheck_state and recheck_requested_by=\"$recheck_requested_by\";" | series_db_execute
+}
+
+function insert_recheck_request_if_needed() {
+    local recheck_instance="$1"
+    local recheck_project="$2"
+    local recheck_msgid="$3"
+    local recheck_requested_by="$4"
+    local recheck_series="$5"
+
+    if ! echo "select * from recheck_requests where recheck_message_id=\"$recheck_msgid\";" | series_db_execute | grep $recheck_msgid >/dev/null 2>&1; then
+        echo "INSERT INTO recheck_requests (recheck_message_id, recheck_requested_by, recheck_series, patchwork_instance, patchwork_project, recheck_sync) values (\"$recheck_msgid\", \"$recheck_requested_by\", \"$recheck_series\", \"$recheck_instance\", \"$recheck_project\", 0);" | series_db_execute
+    fi
+}
+
+function get_recheck_request() {
+    local recheck_instance="$1"
+    local recheck_project="$2"
+    local recheck_msgid="$3"
+    local recheck_requested_by="$4"
+    local recheck_series="$5"
+    local recheck_state="$6"
+
+    echo "select * from recheck_requests where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\" and recheck_message_id=\"$recheck_msgid\" and recheck_sync=$recheck_state;" | series_db_execute
+}
+
+function set_recheck_request_state() {
+    local recheck_instance="$1"
+    local recheck_project="$2"
+    local recheck_msgid="$3"
+    local recheck_requested_by="$4"
+    local recheck_series="$5"
+    local recheck_state="$6"
+
+    echo "UPDATE recheck_requests set recheck_sync=$recheck_state where patchwork_instance=\"$recheck_instance\" and patchwork_project=\"$recheck_project\" and recheck_requested_by=\"$recheck_requested_by\" and recheck_series=\"$recheck_series\";" | series_db_execute
+}