[lustre-devel] [EXTERNAL] Re: Lustre Arm stuff status and work plan

Kevin Zhao kevin.zhao at linaro.org
Sun Mar 20 05:26:19 PDT 2022


Hi James,

Thanks! I think the abstract is quite nice.

On Sat, 19 Mar 2022 at 06:47, Simmons, James <simmonsja at ornl.gov> wrote:

> By joint talk I'm referring to a presentation at LUG about how to do this
> type of setup. Here is the abstract I'm thinking of sending to easychair.
> Feedback welcomed.
>
> The scope of the Lustre support matrix has grown due to community
> involvement, but the testing has been very limited and the results often go
> unpublished. Whamcloud is too resource constrained to support every platform
> requested. Discussions have opened up about how to create testing sites
> outside of Whamcloud that are tied into the general testing framework.
> The goal is to enable external sites to run the equivalent of the
> tests Whamcloud runs when work is submitted to the OpenSFS Lustre source tree
> and to have the results published to the general Maloo testing framework.
> This talk will go over what needs to be done to create such a setup.
> ------------------------------
> *From:* Kevin Zhao <kevin.zhao at linaro.org>
> *Sent:* Tuesday, March 15, 2022 7:25 PM
> *To:* Simmons, James <simmonsja at ornl.gov>
> *Cc:* Xinliang Liu via lustre-devel <lustre-devel at lists.lustre.org>;
> Peter Jones <pjones at whamcloud.com>
> *Subject:* Re: [EXTERNAL] Re: [lustre-devel] Lustre Arm stuff status and
> work plan
>
> Hi James,
>
> That would be great! I look forward to having a joint talk at LUG2022 :-).
>
> Xinliang and I are now working on the test cluster setup and hopefully, we
> will have some progress quite soon.
>
> On Wed, 16 Mar 2022 at 06:09, Simmons, James <simmonsja at ornl.gov> wrote:
>
> Hello.
>
>     I have been watching your efforts to do your own testing, and this
> is something ORNL has been interested in as well.
> I was wondering whether you would be willing to do a joint talk at LUG on this
> effort. We can pool our knowledge on how to do
> local testing and feed the results back to WC. Would you be interested?
> ------------------------------
> *From:* lustre-devel <lustre-devel-bounces at lists.lustre.org> on behalf of
> Kevin Zhao via lustre-devel <lustre-devel at lists.lustre.org>
> *Sent:* Friday, March 11, 2022 1:28 AM
> *To:* Oleg Drokin <green at whamcloud.com>
> *Cc:* Li Xi <lixi at ddn.com>; Jian Yu <jiyu at whamcloud.com>;
> cloud-dev-request at op-lists.linaro.org <
> cloud-dev-request at op-lists.linaro.org>; Xinliang Liu via lustre-devel <
> lustre-devel at lists.lustre.org>
> *Subject:* [EXTERNAL] Re: [lustre-devel] Lustre Arm stuff status and work
> plan
>
> Thanks Oleg,
>
> I will post updates on the progress of the test cluster setup on the Arm64 platform.
>
> On Mon, 28 Feb 2022 at 13:36, Oleg Drokin <green at whamcloud.com> wrote:
>
> Hello!
>
>   The sizing really depends on your test scaling requirements.
>   For example, my own test infrastructure is a couple of builders + 4 nodes
> for VMs (each has 256G RAM), 160 VM pairs in total,
>   and on a particularly busy day another 80 VM pairs can be added. This is
> to ensure speedy feedback to developers.
>   You can operate a much smaller-scale testing system if you want; just
> keep in mind how long the longest-running test would take,
>   to understand how many patches could be tested in parallel (sometimes
> patch bombs result in 20+ patches being submitted at the same time).
>    Here are the stats for the last 30 days: hxxps://imgur.com/lk2ogJv (1 item means
> a single patch in processing). Time in testing for a patch is typically about
> 3.5 hours.
>
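The sizing reasoning above can be turned into a rough back-of-the-envelope estimate. The Python sketch below is only an illustration: the 160 VM pairs, the 3.5 hours per patch and the 20-patch burst come from the message, while `pairs_per_patch` (how many VM pairs one patch occupies while it is being tested) and both function names are assumptions made for the example, not figures from the thread.

```python
import math


def concurrent_patches(vm_pairs: int, pairs_per_patch: int) -> int:
    """How many patches can be in testing at the same time."""
    if vm_pairs <= 0 or pairs_per_patch <= 0:
        raise ValueError("vm_pairs and pairs_per_patch must be positive")
    return vm_pairs // pairs_per_patch


def hours_to_drain(patches: int, vm_pairs: int, pairs_per_patch: int,
                   hours_per_patch: float = 3.5) -> float:
    """Wall-clock hours until a burst of patches has been fully tested.

    Patches are processed in waves: each wave tests as many patches as
    fit on the cluster, and every wave lasts hours_per_patch.
    """
    slots = concurrent_patches(vm_pairs, pairs_per_patch)
    if slots == 0:
        raise ValueError("cluster is too small to test even one patch")
    waves = math.ceil(patches / slots)
    return waves * hours_per_patch


# pairs_per_patch=8 is a made-up value, used only for illustration.
print(hours_to_drain(patches=20, vm_pairs=160, pairs_per_patch=8))
print(hours_to_drain(patches=20, vm_pairs=40, pairs_per_patch=8))
```

Under that made-up value of 8 VM pairs per patch, a 20-patch burst clears in a single 3.5 hour wave on 160 VM pairs, while a 40-pair cluster needs four waves, i.e. 14 hours. The real number of VM pairs per patch has to be taken from an actual test session.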
> Maloo shows the resources when you go into a test session, for example
> hxxps://testing.whamcloud.com/test_sessions/4de25b47-43fc-4bfc-87aa-15e4968519a7
> - scroll down to see the list of nodes.
>
>
>
> On Feb 18, 2022, at 3:05 AM, Kevin Zhao <kevin.zhao at linaro.org> wrote:
>
> Hi All,
>
> Greetings and thanks a lot for your comments! Xinliang and I are from
> Linaro, an organization focusing on Arm open-source ecosystem development.
> We have been working on Lustre on both the Arm64 server and client side for a
> while now, and have already fixed a few bugs on arm64.
> As Xinliang said before, we want to enable Arm64 CI, and Oleg advised
> that we can plug our own CI nodes into Jenkins. Now we want to
> estimate how many machine resources we need,
> so that we can plan the next stage of our hardware to meet the Lustre test
> requirements.
>
> As I understand it, the test jobs will cover the ZFS and Ldiskfs backends in
> 2 scenarios:
>
>    - Lustre Arm64 Server + Arm64 Client (high priority)
>    - Lustre Arm64 Server + x86_64 Client
>
> I have gone through the Lustre test website,
> hxxps://testing.whamcloud.com/test_sessions. The test info is shown quite
> clearly, but some questions remain, and it would be great if the
> community could give me clear answers.
> 1. Is there a link that shows all the machine resources, including the
> machine info, CPU, memory and peripheral info?
> 2. Is there a CI infrastructure architecture overview diagram showing machine usage
> and communication?
> 3. How many machines are needed to meet the requirements of the Lustre Arm64
> Server + Arm64 Client test?
>
> Thanks a lot for your time, and I look forward to your response.
>
>
> On Tue, 28 Dec 2021 at 09:58, Oleg Drokin <green at whamcloud.com> wrote:
>
>
>
> On Dec 27, 2021, at 8:53 PM, Xinliang Liu <xinliang.liu at linaro.org> wrote:
>
> Maloo is just one place to link to so that people can see the results,
> but you can link to external resources too,
> as e.g. the gatekeeper janitor helper does, or, assuming the information is
> small enough, it could be entirely contained
> in the comment (like, say, for a build failure).
>
>
> OK, understood. Is there any other external CI that currently posts
> results to Lustre gerrit and could serve as a reference?
>
>
> Currently there are:
> - checkpatch and Misc code checks (smach), which post their results as 100%
> comment only. They pretty much share a codebase.
> - the Janitor (also started from the above codebase, but was changed and
> extended a lot)
>
> There was external interest in the past in posting results to gerrit, but it
> never materialized in the end.
>
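To make the "comment only" approach more concrete, here is a minimal Python sketch of how an external CI might prepare such a comment. It is not taken from the checkpatch or Janitor code. It assumes Gerrit's standard REST "set review" endpoint (`POST /a/changes/{change-id}/revisions/{revision-id}/review` with a JSON body carrying a `message` field); the function name, the host names and the change number are hypothetical. The sketch only builds the request and does not send it, so authentication and error handling are left out.

```python
import json
from urllib.parse import quote


def build_review_request(base_url, change_id, revision, summary,
                         results_url=None):
    """Return (url, json_body) for a comment-only Gerrit review.

    The comment either carries the whole result (short summary) or
    links to an external results page, as described above.
    """
    message = summary
    if results_url:
        message += f"\nFull results: {results_url}"
    url = (f"{base_url.rstrip('/')}/a/changes/{quote(change_id, safe='')}"
           f"/revisions/{quote(revision, safe='')}/review")
    # No "labels" entry: the result is posted as a comment only.
    body = json.dumps({"message": message})
    return url, body


# Hypothetical host names and change number, for illustration only.
url, body = build_review_request(
    "https://gerrit.example.org/", "12345", "current",
    "arm64 build failed", "https://ci.example.org/run/1")
print(url)
print(body)
```

Leaving out the `labels` field keeps the post a pure comment; whether an external CI should ever vote would be up to the project.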
>
>
> --
> *Best Regards*
>
> *Kevin Zhao*
>
> Tech Lead, LDCG Cloud Infrastructure
>
> Linaro Vertical Technologies
>
> IRC(freenode): kevinz
>
> Slack(kubernetes.slack.com): kevinz
>
> kevin.zhao at linaro.org | Mobile/Direct/Wechat:  +86 18818270915
>

-- 
*Best Regards*

*Kevin Zhao*

Tech Lead, LDCG Cloud Infrastructure

Linaro Vertical Technologies

IRC(freenode): kevinz

Slack(kubernetes.slack.com): kevinz

kevin.zhao at linaro.org | Mobile/Direct/Wechat:  +86 18818270915