UCL E2E Monitoring Workshop

Peter Clarke organized and hosted an informal workshop at the University College London on 5/15/03, entitled UCL E2E Monitoring Workshop.

Topics of discussion included Internet2's E2E piPEs project and ongoing efforts and interests at DANTE, CLRC-DL, SLAC, UCL, and UKERNA.

Participants:

Many participants (marked above with an asterisk) continued meeting on 5/16/03, with the primary focus being a revision of the v1 piPEs architecture. The revised architecture incorporated both new ideas and ideas gleaned from the previous day's workshop. Major accomplishments included a cleaner definition of modules, a cleaner definition of protocols, a fully integrated scheme for access, authentication, and authorization, and a scheme designed to minimize DOS attacks (either as a hijacked participant or as the target). These ideas will be sent out to the Workshop participants and other interested parties for further review. See below for the outcome of the architecture revisions.

Presentations:

Matt Zekauskas's Workshop Notes:

Workshop at UCL on End-to-End Performance
15-May-2003

NOTE: these are notes, not minutes, from the meeting, and therefore
have not been distilled to any extent, and are not complete; they
are not necessarily even complete sentences.  However, they do give
an idea of what went on. In some cases, speakers are noted by their
initials.  

Participants:

Peter Clarke, UCL (pc)
  In principle particle physics group,  but does more computing

Victor Reijs, HEANET (vr)
  Doing work for DANTE looking for possible implementation of 
  multi-domain measurement infrastructure in Europe using DANTE network 
  and equipment.  "Also into NRNs"

Matt Zekauskas, Internet2 (mz)
  Engineer with Internet2 working mostly on measurement and performance
  issues.

Eric Boyd, Internet2 (eb)
  Engineer with Internet2 on the End-to-End Performance Initiative,
  working on piPEs [Performance Improvement Performance...] project

Paul Mealor, UCL (pm)
  Working on publication of network monitoring results, especially
  into the GRID environment.

Paul Bright-Thomas, UCL (pb)
  Helping Peter with the workshop

David Salmon, UKERNA (the academic network in UK) (ds)
  David is UKERNA's liaison with the research community. He wants to
  understand the community needs, but has a particular interest in 
  network monitoring from an operator's point of view.

Duncan Rogerson, UKERNA.  (dr)
  Worked on JANET for a while.  Has a development / architecting job,
  and wants to focus on monitoring and end to end performance issues

Richard Hughes-Jones, UKERNA.  (hh)
  Work with David & Mark
  He's interested in monitoring at various levels.  He is here to
  see what needs to be done, and consider the more general case.

Mark Godfrey, UKERNA. (mg)
  (relatively new) Network monitoring & end to end performance.

Warren Matthews,  SLAC - ESnet in the US (wm)
  At SLAC, the big experiment is babar.  Terabytes of data 
  moved to processing farm, so networks important, and
  therefore monitoring important.

Yee-Ting Li,  PhD at UCL.  (ytl)
  Primary interest is to do thesis, and do some work on pipes.
  He is interested in more generic roles in monitoring, and
  the implications of large systems.

Mark Leese, .  (ml)
   UK E-science program.  Looking after gridmon (network
   monitoring toolkit).  He wants to see what pipes can do for him, 
   and maybe what he can do for pipes

Nicolas Simar, DANTE. (ns)
   Working on an end-to-end performance monitoring and debugging system
   for DANTE, in conjunction with Terena's TF-NGN (hence works with
   Victor as well as Simon Leinen)

remotely,
George Brett, Internet2
Russ Hobby, Internet2
   listened in


============

Peter Clarke opened the meeting.  It was purposely not advertised, but
invitation-only so it can be a working meeting.  His particular interest
or angle: interested in near operations side of networks as they affect
applications.  Definitely an applications focus.

UCL is formally a center of excellence in e-science in UK,
to help applications, and in particular grid-based applications (and
"grids" themselves) to work.

Why here?
  There are/will be measurement points in some form at core routers (GEANT)
    useful for network operations


What about "grid operations"?
  There are serious professional grid infrastructures being created
  around Europe.

Then how can applications use them.

If putting performance measurement points in routers in any way (publishing
  information), let's publish in a common way ... preferably through 
  "standards/proposals in GGF"
  The information should only be available with proper authorization.

  There are various portals to look at information
  Internet2, GEANT, Operations, Grid Operations

If in middle anything with OGSA, then it brings in all the
  open grid services

---------
David Salmon wanted to state his interest before the meeting really
got under way by way of introduction as well.

  monitoring - working right?
  strands about traffic on backbone
    current focus is separate from network operations focus
    required to report traffic loads to superiors
    so do it for "service level agreement monitoring"

    UKERNA has 20 regional networks
    and a backbone
    want to measure traffic to sites

    want to carry forward
    like escience / data models for gathering
    want main thrust - see what's going on in abilene, europe, look for
     hook in 

    Mark Godfrey & Mark Leese -> both funded by escience program
      Mark G: from perspective of backbone
      Mark L: e-science... applications

   In the short term only have resources for simple stuff, mrtg +.
   but also gather other information; Looking to integrate with mark leese
   measurements... then forward through publishing    angle

   longer term possibilities: production and development network
     tightly focused development in crucial positions within
     backbones
     borders w/regionals
        install platforms capable of taking on board broader things
        such as the TF-NGN project in europe, pipes in us

     For the even longer term - how expand that capability for more coherent
      not just national, but also regionals... 
      create an architecture that regionals can opt-in to.
      ..as part of next version of janet.

short-term standards.
real-time stuff  and IPv6 will require thinking
  more extended platforms 

pc: like not being descriptive, but defining interface and publish & how
 (and then not telling people how to do it)
 [but give 'em something that want]

For e-science infrastructure in UK, Mark is doing iperf measurements, but 
  very [intended application] flow specific.

----------
Next up is Eric Boyd to explain the pipes infrastructure being developed
by Internet2.

[see slides, questions noted here]

ds: very interested in observatory

?: what are the differences among the End-to-end Performance Initiative,
   pipes, and the Abilene Measurement Infrastructure (AMI)?
   [leading question! - see slides]

ds: what about shibboleth?
  we need some authentication mechanism, shibboleth is one example.

pc: About the measurement results themselves, are they available to everyone,
    or are some protected?
  because people can misuse - and in some cases badger people in operations
  centers  wrongly
eb: possibly availability also based on role

result: add policy lookup on read of db, can be null, but should be there

eb: we haven't thought a lot about policy, stance tends to be open but
we recognize a need.
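
A minimal sketch of the "policy lookup on read" idea, assuming an in-memory dictionary stands in for the results database. The names `RESULTS`, `operators_only`, and `read_result` are made up for illustration and are not part of piPEs. The point is only that the hook is consulted on every read, and that a null policy means the open stance described above.

```python
from typing import Callable, Optional

# Hypothetical stand-in for the measurement results database.
RESULTS = {
    ("hostA", "hostB", "owd_ms"): 12.5,
}

# A policy takes (requester role, result key) and says whether the read is allowed.
Policy = Callable[[str, tuple], bool]


def operators_only(role: str, key: tuple) -> bool:
    """Example role-based policy: only operators may read."""
    return role == "operator"


def read_result(key: tuple, role: str, policy: Optional[Policy] = None) -> float:
    """Look up a result, consulting the policy hook on every read.

    A null policy (None) means the data is open, but the hook is always there.
    """
    if policy is not None and not policy(role, key):
        raise PermissionError(f"role {role!r} may not read {key!r}")
    return RESULTS[key]
```

With no policy installed, any role can read; installing `operators_only` makes the same call from a non-operator raise `PermissionError`.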

General pipes goals:
Build the system as modular as possible
  so can swap in new stuff

open source as much as possible
  modified Berkeley license for Internet2 software (in particular, 
     not GPL which hinders our ability to work with corporate partners)


We see an evolving set of tools over time, including
  "owamp" - one way ping
  throughput
  flow data, anonymized 
  traceroute data (initially literally traceroute
   eventually get routing database)
   like dynamic looking glass
  snmp data out of router interface

open research questions include
 algorithms -  encoding a network engineer's brain
 measurement schema - common data formats

pc: How to discover PMP (performance measurement point) comment - just 
   because it can be a service doesn't mean it should be.
 For example, if the point is very closely connected to router... it's 
   not obvious that should advertise itself.
 You want the thing that a domain presents to world to be discoverable
 Can you define all sets of questions that can be asked?

 That's an "interface architecture" and it would be good to discuss.

As to security concerns, our current ranking
mainly concerns about DOS attacks
  (1) don't use infrastructure as a weapon
  (2) the infrastructure (data produced by the infrastructure) is not
      compromised by an attack

mz - explains why 4 PCs currently, and abilene observatory idea

pc: (on scheduling active measurements) should be test (dataflow model) - 
   ensure don't clog network  or ...

ds: what's CDMA?
mz: one of the mobile phone technologies in US, has GPS time embedded
   in signals

q: owamp, output?  understanding tool
  not web service (no, Warren is wrapping output in a web service)

q: Internet2 detective - how does it fit in
  Currently separate, but one portal to measurement infrastructure

q: platforms? (PC or sparc or ...)

q: ntp, need gps or cdma
  depends on what you're trying to measure

----------
Nicolas Simar presented current TF-NGN monitoring work, focus is on
inter-domain: how have tools in backbone plus opt-in tools in other
administrative domains... how find tools in other domain, what protocol
between domains, include authorization.  Utilize existing measurements
at first.

[see slides]

domain tool - the entity that exists per-domain.  current focus.
  measurement points within domain, could be anything.  Initially
  want to use RIPE Test-Traffic information (cannot schedule tests,
  but can get results.)

test definition: like Internet2's owamp +

measurement box guidelines: like Internet2's templates

measurement protocol: for active measurements, what is format of packets?
  Perhaps can use OWAMP directly (or common tools?) [OWAMP the protocol,
  not the tool being tested by Internet2, is being standardized within
  the IETF (IP performance metrics (ippm) working group)]

combining measurements:  How combine measurements taken of segments
  along an end-to-end path?
  - for one-way delay, add?
  - but what happens with jitter?
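
One possible answer to the two questions above, as a sketch only. It assumes jitter is expressed as a standard deviation of delay and that the delay variations on different segments are independent. Neither assumption was agreed at the meeting, and the segment figures are invented. Under those assumptions mean delays add directly, while jitter combines as the root of the summed variances rather than as a plain sum.

```python
import math

# Hypothetical per-segment summaries:
# (mean one-way delay in ms, jitter as a standard deviation in ms).
segments = [(4.0, 0.3), (11.0, 0.4), (2.5, 1.2)]


def combine(segments):
    """Combine per-segment summaries into an end-to-end estimate."""
    total_delay = sum(mean for mean, _ in segments)
    # Assumption: segment delay variations are independent, so variances add.
    total_jitter = math.sqrt(sum(sd ** 2 for _, sd in segments))
    return total_delay, total_jitter


delay, jitter = combine(segments)
print(f"{delay:.1f} ms, {jitter:.2f} ms")
```

Simply adding the three standard deviations would give 1.9 ms, which overstates the jitter if the independence assumption holds. If the segments are correlated, neither figure is right.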

user representation: who present, and what can understand
  [what about program?]

pathfinder: how map measurement point with given IP address, say starting from 
 traceroute. (presuming src,dst IP or name from user)
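
A rough sketch of the pathfinder step, assuming each domain publishes which address prefixes its measurement points cover. The table, the host names, and the function name are hypothetical, and the addresses are documentation ranges. Hop addresses would come from parsing traceroute output, which is not shown.

```python
import ipaddress

# Hypothetical table: which measurement point covers which address prefix.
PREFIX_TO_MP = {
    ipaddress.ip_network("192.0.2.0/24"): "mp.domain-a.example",
    ipaddress.ip_network("198.51.100.0/24"): "mp.domain-b.example",
}


def measurement_points_on_path(hops):
    """Return the measurement points covering the hops, in path order, no repeats."""
    found = []
    for hop in hops:
        addr = ipaddress.ip_address(hop)
        for net, mp in PREFIX_TO_MP.items():
            if addr in net and mp not in found:
                found.append(mp)
    return found


# Hop addresses as they might be parsed from traceroute output.
hops = ["192.0.2.1", "192.0.2.254", "203.0.113.7", "198.51.100.9"]
print(measurement_points_on_path(hops))
# ['mp.domain-a.example', 'mp.domain-b.example']
```

The hop in 203.0.113.0/24 has no measurement point in the table, so that segment of the path would simply have no data. A real table would need longest-prefix matching, which this sketch leaves out.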

tests
  RTT in addition to owd, so can compare, utilize many traditional
    RTT tools.

  also what about packet distributions (for active measurements)
   - what's best
   - looking for input from existing systems

   other types: want input from NREN, TF-NGN, APM, etc.: so have
     common tests in all parts of network (not partitions)
     min level of tests deployed
     plus what other tests are desired

     want to be as widespread as possible

measurement boxes themselves: have input from
  HEANET
  GARR
  DANTE

GPS and NTP, perhaps can use longwave
  d-gps
  atomic clock

  It isn't all that important what tool.  A given tool has some accuracy, 
   **have to always specify accuracy of measurements**
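
To illustrate why the accuracy has to travel with the value, here is a small sketch for one-way delay. The class and field names are made up, and the clock error figures are illustrative assumptions, not measured properties of GPS or NTP. The only claim is the arithmetic: in the worst case the sender and receiver clock errors push in opposite directions, so their bounds add.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timestamp:
    seconds: float  # time as reported by the local clock
    error: float    # bound on how far that clock may be from true time


@dataclass(frozen=True)
class Measurement:
    value: float
    error: float    # never published without this


def one_way_delay(sent: Timestamp, received: Timestamp) -> Measurement:
    """One-way delay with a worst-case error bound from both clocks."""
    return Measurement(received.seconds - sent.seconds, sent.error + received.error)


# Illustrative clock error bounds only: 10 microseconds vs 5 milliseconds.
tight = one_way_delay(Timestamp(100.000, 0.00001), Timestamp(100.012, 0.00001))
loose = one_way_delay(Timestamp(100.000, 0.005), Timestamp(100.012, 0.005))
print(tight, loose)
```

Both results report roughly the same 12 ms delay, but the second has an error bound of 10 ms, which makes it nearly useless on its own. Without the error field the two would look identical.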

q: is GPS used by RIPE TT  available independent of RIPE?
  yes
  need ntp config - important to get correct to be accurate

vr: always carry error
    also hotels in Ireland retrans 1 PPS signal.

q: who's in the trial?
   SWITCH, DANTE, ...

pc: seem to presuppose what arch must be
    chains domain tool to domain tool
    vs user hits multiple domain tools.

  not fully defined.  what we've presented is one coherent thought
  can access any domain tool if available
     referral is one way.


q: concatenated error if concatenate measurements?
  yes 

vr: hard problem; can concatenate averages, can't necessarily 
    concatenate percentiles
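
A small worked example of vr's point, using invented delay samples for two segments and assuming the segments are independent so every pairing of samples is equally likely. The nearest-rank percentile used here is one of several common definitions.

```python
from itertools import product
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile for integer p in 1..100."""
    ordered = sorted(values)
    rank = max(1, -(-(p * len(ordered)) // 100))  # ceiling of p * n / 100
    return ordered[rank - 1]


# Invented samples (ms): mostly fast, with an occasional slow packet.
seg_a = [1] * 9 + [10]
seg_b = [1] * 9 + [10]

# Assumption: independent segments, so all pairings are equally likely.
end_to_end = [a + b for a, b in product(seg_a, seg_b)]

sum_of_means = mean(seg_a) + mean(seg_b)
sum_of_p90 = percentile(seg_a, 90) + percentile(seg_b, 90)
print(sum_of_means, mean(end_to_end))
print(sum_of_p90, percentile(end_to_end, 90))
```

The means agree (3.8 ms either way), but adding the per-segment 90th percentiles gives 2 ms while the 90th percentile of the end-to-end delay is 11 ms. Each segment is slow only one time in ten, but at least one of the two is slow about one time in five.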

As to scheduling, do per domain controller.  
  [know things like, RIPE can't do more than 500,000 tests/sec]

q: 500,000 pps limit?  where does it come from?

  given to us from ripe, machine specific parameter
  expect will be other parameters for other equipment, this is just an
  example
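
A sketch of what a per-domain controller could do with such a parameter, assuming each test request declares the packet rate it needs. The class and method names are hypothetical, and the limit is just the example figure from the discussion.

```python
class DomainScheduler:
    """Admits test requests only while the domain's packet budget allows it."""

    def __init__(self, max_pps):
        self.max_pps = max_pps  # equipment-specific limit supplied by the operator
        self.committed_pps = 0

    def request(self, pps):
        """Reserve capacity for a test; return False if it would exceed the limit."""
        if self.committed_pps + pps > self.max_pps:
            return False
        self.committed_pps += pps
        return True

    def release(self, pps):
        """Give capacity back when a test finishes."""
        self.committed_pps = max(0, self.committed_pps - pps)


sched = DomainScheduler(max_pps=500_000)  # example figure from the discussion
print(sched.request(300_000))  # True
print(sched.request(300_000))  # False, would exceed the limit
```

Other equipment would bring other parameters (bandwidth, concurrent tests), which could be added as further budgets checked in `request`.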


start 
  one-way meas
  looking glass functionality
  detect dos - scampi passive mon infra

q: data format w/in proto or new proto

[note taker had to leave for 30 min.]

----------
LUNCH.

pc introduces the afternoon:

An XML Schema 
  motivation from top down, ggf down

mark leese works for rutherford laboratory
  sister lab of rutherford
  CLRC 

  end station mon - app level mon at all e-science centers around UK

  users end perspective

warren matthews IEPM

victor - TF-NGN view / work.
  functional design of domain tool

----------
An XML schema for NMWG Yee-Ting Li, UCL

metrics
  all stored in some nodes in some format
  usually flat text

extensions: store in db, query by sql

more generic approach: XML

NMWG (network measurement working group of GGF (global grid forum))
  characteristics & metrics document
   hierarchy
   ontology
  tools document
    maps specific tools to specific points in hierarchy

  pull from IPPM
  but slightly different goals, but meanings are overlapping & orthogonal
  ->

  what ippm done, is how to measure, format of packets
   what metrics looking at

  what c&m doc describes: what characteristics actually is
   rather than what value is or how measure it

  in order to define characteristic, define measurement methodology
   > how to measure
  
  (IPPM merges two)

  "meaning aspect".  meanings to terms (not how measure)

(singleton)(sample)(statistic) -> observation
 --is a result of--> measurement methodology -- measures--> characteristics
  --describes--> network entity

entity: e2e path, or hop in path or snmp bandwidth or ...


using XML schemas to describe what they have represented

  can describe anything - metadata
  big industrial standard

  used throughout web services and OGSA technologies

XML schema : describes how elements are placed

network entity element
target as endpoints

measurement methodology is hairy
technique for recording
lots of text

  stick most important characteristics in this document.

  tool: iperf, ping, ...
    usually how practically realize methodology
    version, name, and a list of parameters specific to tool
  layer: describes where in OSI layer tool is working, details
    of specific stack implementations, packet size
  path: a link or path, technically -- how tool would see network entity
    for e2e, see src,sink
    inside that path, descriptions of characteristic
       list of nodes

characteristics: 
  0 or more, of all types (delay, loss, ...)

tries to give results of multiple runs (historical set of results)

observation sample
  interval (of observations)
  observation
    interval (of singletons)
    singleton, statistics
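
To make the nesting concrete, here is a sketch that builds a small ping-style document with Python's standard `xml.etree.ElementTree`. The element and attribute names are invented to mirror the outline above. They are not the actual NMWG schema, and the sample shown at the meeting may have been structured differently.

```python
import xml.etree.ElementTree as ET


def build_ping_document(src, dst, rtts_ms):
    """Build a toy document: entity, methodology (tool), path, characteristic, observation."""
    root = ET.Element("networkEntity")
    method = ET.SubElement(root, "methodology")
    ET.SubElement(method, "tool", name="ping")
    path = ET.SubElement(root, "path", source=src, sink=dst)
    char = ET.SubElement(path, "characteristic", type="delay.roundtrip")
    observation = ET.SubElement(char, "observation")
    for rtt in rtts_ms:
        # Each individual measurement is a singleton.
        ET.SubElement(observation, "singleton", units="ms").text = str(rtt)
    # A statistic summarizes the singletons in this observation.
    ET.SubElement(observation, "statistic", type="min").text = str(min(rtts_ms))
    return root


doc = build_ping_document("192.0.2.1", "198.51.100.9", [20.1, 19.8, 20.5])
print(ET.tostring(doc, encoding="unicode"))
```

A historical set of results would repeat the `observation` element, each with its own interval, inside an enclosing sample element.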

[ping document sample shown]

q: this doc is an answer to what question

  please tell you everything between 2 ip addresses
  or this tool
 or ...

  ping results
  > round trip time measurements?

has network entity as an element, so this describes a network entity
meas methodology to work out characteristics between two elements.

this tries to describe a methodology
one way of realizing that is to describe tool such as ping
describes network entity - node a & node b in context of
network methodology


pc: you said you had to implement some things differently from NMWG document
  why?
  good reason?  yes, xml couldn't implement same way.
  ...should get reported back to GGF

-=-=-
Mark Leese
 UK e-sciences stuff.

has a toolkit installed on end systems within UK
  with exception of bbcopy & bbftp work every 1/2 hr
  store data locally

1 machine to all. so build mesh.  ok, since 10-12 science centers, n^2 isn't
   intractable
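
A quick check of the mesh size, assuming every center tests to every other center in both directions. The site names are placeholders.

```python
from itertools import permutations


def mesh_tests(sites):
    """Every ordered (source, sink) pair in a full mesh."""
    return list(permutations(sites, 2))


sites = [f"site{i}" for i in range(1, 13)]  # 12 placeholder centers
print(len(mesh_tests(sites)))  # 12 * 11 = 132
```

At 132 directed pairs per round the mesh is manageable, but the count grows as n(n-1), so 100 sites would already mean 9,900 pairs.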

web interface there
publication service is not there (originally thought about ldap)

* dedicated monitoring node
* similar connectivity to e-science resources

IperfER -> scripts to run iperf (from slac, originally)
MIperfER  -> do it in multicast mode!

  
pc: operational question
  are the tools demonstrably useable to end application
  or because network staff looks at it

  people looking on behalf of users...

pc:
iperf used by particle physics uk... look within community
  one finds things from time to time.  not too often.

skeptical, perceived to be used by people at ends yet
something we need to do.

so have stuff at e-science center, not at particle physics site...

vr: trying to set up PERT like CERT, so people can drop performance problems
  there and "will be solved"  then these tools & monitoring structure can
  be solved.

  planned to be running in sept.

http://www.hep.ucl.ac.uk/e2emon/e2emon.html

pc: John McAllester at Oxford, he's run most powerful tool to used in
  years: traceping.  has a good chance of finding some of the path
  problems.  combine ping   and traceroute.  one tool that used to
  come bang on your guy's door.
  [discussion about how useful it really is]

show where going.

q: precursor to AG tools time.  ran tools continuously, so used quite
  effectively to debug problems.

q: beacon stuff as well
   


wishlist features: ask.

http://gridmon.dl.ac.uk/

m.j.leese at dl.ac.uk; http://gridmon.dl.ac.uk/~mjl

----------

Warren Matthews

First, a side note about available bandwidth estimation tool: ABWE
ABWE - agreement with iperf, measure less often

achievable bandwidth one thing
 two tools, different estimates
  error associated?

  turn argument around
  rather than this is the number this thing gives
    this is the meaning, estimator of that.

  when clock is started changes between the tools
   ..should be taken care of by meaning

On to publishing
  no interoperability among methods
    SOAP::lite perl module
    Python
    Java

NMWG
OGSA

--
Publishing

NMWGproperties document
  path.delay.roundtrip
  hop.

soap lite - painfully simple
  point to WSDL description
  there is a method there
   pass arg (endpoints)

src,dst in one field - then can describe either.
slac security does not allow router names to be exposed, even
 w/in slac.

./tracespeed - similar to traceroute

need a conditions database.
  for example, need to know state of collider to understand phys experiment
  similarly, dump sys entries from kernel.

web service front end to Arena (an Internet2 networks database)
--
advisor - human visualization

python web service client within advisor to pull info
--
monalisa is big thing

CMS (physics database) it's the primary monitoring.

http://monalisa.cern.ch/MONALISA

monitor farms.
  info that's plugged in there, run through web services
   so can grab whatever I want.

why publish:
  troubleshooting
     RIPE-TT
     AMP Automatic Detection System

    using both of those + diurnal changes.

Looking for anomalies...

parameterize perf in terms of hr and variability w/in that hourly bin
measurements can be characterized in terms of how they differ from 
historical value
compare w/prev bin to reduce false-positives

median & standard deviation for last 5 measurements in bin (weeks)
concerned if latest measurement is more than 1 sd from median
alarmed if latest measurement is more than 2 sd from median

(tony's work is similar)

only write to the log if an alarm is triggered
keep writing until alarm is cleared
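
The thresholds and logging rule above can be sketched as follows. This is an interpretation of the notes, not the actual implementation: it assumes the sample standard deviation, treats an alarm as cleared when a measurement returns to normal, and leaves out the comparison with the previous bin. Function names are made up.

```python
from statistics import median, stdev


def classify(history, latest):
    """Compare the latest measurement with the last 5 from the same hourly bin."""
    recent = history[-5:]
    if len(recent) < 2:
        return "ok"  # not enough data for a standard deviation
    mid, sd = median(recent), stdev(recent)
    deviation = abs(latest - mid)
    if deviation > 2 * sd:
        return "alarmed"
    if deviation > sd:
        return "concerned"
    return "ok"


def log_entries(states):
    """Start logging when an alarm triggers; keep logging until the state is ok again."""
    active = False
    logged = []
    for index, state in enumerate(states):
        if state == "alarmed":
            active = True
        elif state == "ok":
            active = False
        if active:
            logged.append((index, state))
    return logged


history = [100, 102, 98, 101, 99]  # same hour of day, previous five weeks
print(classify(history, 101), classify(history, 102), classify(history, 104))
```

For this history the median is 100 and the standard deviation is about 1.58, so 101 is ok, 102 is concerned, and 104 is alarmed.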

NetRat
------

alarm system
  multiple tools
  multiple measurement points
    * cross reference
  trigger future meas
  starting point for human intervention
  informant database: hop.performance
no measurement is authoritative

GLUE, OGSA, CIM
 ^ w/us
       ^ grid services
             ^nmwg

work in europe and work in US has drifted, so want to bring back
  publishing and troubleshooting
  discovery
  security

"it is widely believed that a ubiquitous monitoring infrastructure is required"
DOE apparently doesn't want to fund any more.


--------
victor going back to this morning

details on domain tool design

on (domain tool block)
q: specific tool driver -> specific tool

 could have been different, thing attached to router isn't specific
   tool interface

 snmp variables in router are measurement points.

could have interface that popped out nmwg characteristics out
  that is "generic".

that debate should go on (at least understand)

does have API interface or "driver interface" that does this
  (so switch partition slightly)

driver vs. collector


...

multi level data analysis, at least 3
 measurement point
 domain
 interdomain

aggregation function
  statistical functions on raw data
  adding averages, concatenation, at multidomain level

should agree on main protocols
  

http://www.dante.net/tf-ngn/perfmonit/
nicolas.simar at dante.org.uk

http://www.dante.net/tf-ngn/pert
victor.reijs at heanet.ie

----------
Peter Clarke, wrap up:

pragmatically

domain - access interface
  same idea, but arbitrarily different implementation of ideas
  not constraining by doing access interface

vr: only really need to web, domain tools still working

from bottom up: have different things that measure, where is first point
we can comment

no resources for new, but meas made could be published

>>>box that takes measurement

>>>box that delivers one or more characteristics
  may not constrain minimum number of tools

Two functions on same piece of hardware
  bits flowing between boxes ("PMP")
  other stuff: local scheduler, store
   AAA, sched for machine, data store ("PMC")

 PMC-PMP might have 1:1 relationship
  but could be that PMC reps one or more, and not necessarily colocated.

that bridges difference between Eric's outline and what Nicolas's
 (or domain concept)

doesn't matter what we get out as long as common.

Architecture Revisions:

On Day 2, we did a reasonable amount of work revising the v1.0 piPEs architecture picture. The following diagram replaces the Scheduler, a pair of PMPs, the result arbiter, the test arbiter, and the database.

A bit about the new concepts:

Mapping old to new, the Source and Sink both correspond to PMPs on the original diagram. The test arbiter, central scheduler, and individual PMPs are replaced by a domain interface (representing the test arbiter and the administrative domain's AAA), the PMC (representing the measurement node's local scheduler and AAA), and the PMP (representing the barebones ability to accept a test initiation request, to run a test, to store the data locally, and to send it to the database gatekeeper, which used to be called the result arbiter).

The idea here is that these three components can be thought of as independent processes that may or may not run on the same machine. There may be one or more domain interfaces per administrative domain. There may be one or more PMCs per domain interface. There may be one or more PMPs per PMC. One common case is likely to be a single domain interface, and several PMC/PMP/hardware packages per administrative domain.
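
As a sketch of these cardinalities, the three components can be modeled as nested records with a check for the "one or more" rule at each level. The class names follow the terms in the text. Everything else (field names, `common_case`, the host names) is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PMP:
    host: str


@dataclass
class PMC:
    host: str
    pmps: List[PMP] = field(default_factory=list)


@dataclass
class DomainInterface:
    name: str
    pmcs: List[PMC] = field(default_factory=list)


@dataclass
class AdministrativeDomain:
    name: str
    interfaces: List[DomainInterface] = field(default_factory=list)

    def validate(self):
        """Check the 'one or more' constraint at every level."""
        if not self.interfaces:
            raise ValueError("a domain needs at least one domain interface")
        for di in self.interfaces:
            if not di.pmcs:
                raise ValueError(f"{di.name}: needs at least one PMC")
            for pmc in di.pmcs:
                if not pmc.pmps:
                    raise ValueError(f"{pmc.host}: needs at least one PMP")


def common_case(domain_name, hosts):
    """One domain interface, and one co-located PMC/PMP pair per machine."""
    pmcs = [PMC(host, [PMP(host)]) for host in hosts]
    return AdministrativeDomain(domain_name, [DomainInterface("default", pmcs)])


domain = common_case("example-domain", ["node1", "node2", "node3"])
domain.validate()
```

Because a PMC only holds references to its PMPs, the same model also covers a PMC that represents several PMPs on other machines.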

In the (rare?) case where the source of the test does not capture the performance data, it is assumed that the tool is wrapped in a script that makes it so. It is also assumed that *all* tests have a source and a sink. Even in the nominally "one-ended" case of 2-way ping, this would enable the sink to unblock a firewall, for example, and would also prevent an overload of pings against the sink.

There's an implicit restriction on the source PMC and PMP, requiring that they ONLY perform tests with other, known PMC/PMP machines. This has the desired effect of reducing the chance that a source could be hijacked for DOS attacks against unrelated machines.
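
A minimal sketch of that restriction, assuming the source keeps a fixed set of known peers. How the set is populated and authenticated is not covered here, and the class, method, and host names are hypothetical.

```python
class SourcePMC:
    """Refuses to aim test traffic at anything that is not a known PMC/PMP."""

    def __init__(self, known_peers):
        self.known_peers = frozenset(known_peers)

    def start_test(self, sink, tool):
        if sink not in self.known_peers:
            raise PermissionError(
                f"{sink} is not a known PMC/PMP; refusing to send test traffic"
            )
        # Placeholder for actually launching the tool toward the sink.
        return f"{tool} -> {sink}"


source = SourcePMC(known_peers={"pmp1.example", "pmp2.example"})
print(source.start_test("pmp1.example", "owamp"))
```

A request naming any other target, such as an unrelated web server, raises `PermissionError` before any packets are sent.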

Our design goals included modularity, minimizing the risk of DOS attacks (against or as agent thereof), flexibility of deployment, clean interfaces, and applicability to most tools.

Note, we fully expect to design an additional module, currently represented by a dotted line, that passes through A and P unchanged, but links B to O.

Here's the protocol diagram we came up with on 5/16/03:


Here's a revised version of the Architecture diagram I did after the fact to match up with the protocol diagram:


A. Here is who I am, and I want "this result" to exist (characteristic)
   and that includes the IP addresses of the two routers [[may be 1 if
   it's a passive result]]
   - newness of test result
B. *no, go away [rejected (and here's why)]
  *yes, result exists  (with pointer)
  *be patient, I will send response later (maybe with time estimate)
 [Q: send response later or ask to call back?]
C. I would like to contact the PMC associated with this router,
    here's who I am, and here's who I am asking on behalf of,
    and here's the tool I'd like to run
D. *Get lost [not authorized, tool unavailable]
  *IP address of sink PMC, capability to use
   [or under hood, tell sink PMC to trust source PMC for a little while]
E. Initiate tool
   - request interface token
   - tool and arguments to run
   - Sink PMC IP address, capability {is a token}
F. here's my capability, can we start tool X with certain parameters now?
   [[need source PMP IP address?]]
G. Start receptor
   - prepare to receive tool
   - IP address of source could be parameter [[never given?]]
   - Identity of Sink PMC (token)
H. *OK
  *Rejected can't do it
  [[pmp doesn't do scheduling]]
I. *Rejected (don't like that capability; failed to initiate receptor)
  *willing and able, IP addr of sink PMP (really machine) to use
  *Ask again in +T time units
J. Start tool:
    - identity of source PMC (token)
    - tool
    - arguments for tool
    - IP addr of sink PMP
K. Test stream from tool
L. Results
    - identity of source PMP
    - identity of sink PMP
    - characteristic [[?]]
    - tool output [[in standard form?]]
    - [[accuracy, other ancillary data that is relevant]]
    (implicit ACK from DB, L is a stream)
M. Result report
   - id of src PMP
   - result ready, pointer to result (pointer contains capability?)
   - result failed
       -- test failed
       -- write to DB failed
N. Result report
   - id of src PMC
   - result ready, pointer to result
   - result failed
      --test failed
      --write to DB failed
   (message B, actually is next)
O. Give me result
    - who I am
P. *Rejected
  *Result
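
As an illustration of the first exchange only (request A and the three possible answers in B), here is a sketch of how a domain interface might decide. The field names, the pointer format, and the fixed 60-second retry hint are assumptions. The list above deliberately leaves several of these details open.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    REJECTED = "no, go away"
    EXISTS = "yes, result exists"
    BE_PATIENT = "be patient"


@dataclass
class ResultRequest:  # message A
    requester: str
    characteristic: str
    endpoints: tuple
    max_age_s: float  # "newness of test result"


@dataclass
class Response:  # message B
    status: Status
    pointer: Optional[str] = None
    reason: Optional[str] = None
    retry_after_s: Optional[float] = None


class DomainInterface:
    def __init__(self, authorized, results):
        self.authorized = set(authorized)
        # Maps (characteristic, endpoints) to (pointer, time the result was taken).
        self.results = dict(results)

    def handle(self, req, now=None):
        now = time.time() if now is None else now
        if req.requester not in self.authorized:
            return Response(Status.REJECTED, reason="not authorized")
        hit = self.results.get((req.characteristic, req.endpoints))
        if hit is not None and now - hit[1] <= req.max_age_s:
            return Response(Status.EXISTS, pointer=hit[0])
        # A real domain interface would kick off messages C onward here.
        return Response(Status.BE_PATIENT, retry_after_s=60.0)
```

A stale or missing result yields "be patient", which is where the open question about sending the response later versus asking the requester to call back comes in.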

Action Items: