Peter Clarke organized and hosted an informal workshop at the University College London on 5/15/03, entitled UCL E2E Monitoring Workshop.
Topics of discussion included Internet2's E2E piPEs project and ongoing efforts and interests at DANTE, CLRC-DL, SLAC, UCL, and UKERNA.
Many participants (marked above with an asterisk) continued meeting on 5/16/03, with the primary focus being a revision of the v1 piPEs architecture. The revised architecture incorporated both new ideas and ideas gleaned from the previous day's workshop. Major accomplishments included a cleaner definition of modules, a cleaner definition of protocols, a fully integrated scheme for access, authentication, and authorization, and a scheme designed to minimize DOS attacks (either as a hijacked participant or as the target). These ideas will be sent out to the Workshop participants and other interested parties for further review. See below for the outcome of the architecture revisions.
Workshop at UCL on End-to-End Performance
15-May-2003
NOTE: these are notes, not minutes, from the meeting, and therefore
have not been distilled to any extent, and are not complete; they
are not necessarily even complete sentences. However, they do give
an idea of what went on. In some cases, speakers are noted by their
initials.
Participants:
Peter Clarke, UCL (pc)
In principle particle physics group, but does more computing
Victor Reijs, HEANET (vr)
Doing work for DANTE looking for possible implementation of
multi-domain measurement infrastructure in Europe using DANTE network
and equipment. "Also into NRNs"
Matt Zekauskas, Internet2 (mz)
Engineer with Internet2 working mostly on measurement and performance
issues.
Eric Boyd, Internet2 (eb)
Engineer with Internet2 on the End-to-End Performance Initiative,
working on piPEs [Performance Improvement Performance...] project
Paul Mealor, UCL (pm)
Working on publication of network monitoring results, especially
into the GRID environment.
Paul Bright-Thomas, UCL (pb)
Helping Peter with the workshop
David Salmon, UKERNA (the academic network in UK) (ds)
David is UKERNA's liaison with the research community. He wants to
understand the community needs, but has a particular interest in
network monitoring from an operator's point of view.
Duncan Rogerson, UKERNA. (dr)
Worked on JANET for a while. Has a development / architecting job,
and wants to focus on monitoring and end to end performance issues
Richard Hughes-Jones, UKERNA. (hh)
Work with David & Mark
He's interested in monitoring at various levels. He is here to
see what needs to be done, and consider the more general case.
Mark Godfrey, UKERNA. (mg)
(relatively new) Network monitoring & end to end performance.
Warren Matthews, SLAC - ESnet in the US (wm)
At SLAC, the big experiment is babar. Terabytes of data
moved to processing farm, so networks important, and
therefore monitoring important.
Yee-Ting Li, PhD at UCL. (ytl)
Primary interest is to do thesis, and do some work on pipes.
He is interested in more generic roles in monitoring, and
the implications of large systems.
Mark Leese, . (ml)
UK E-science program. Looking after gridmon (network
monitoring toolkit). He wants to see what pipes can do for him,
and maybe what he can do for pipes
Nicolas Simar, DANTE. (ns)
Working on an end-to-end performance monitoring and debugging system
for DANTE, in conjunction with Terena's TF-NGN (hence works with
Victor as well as Simon Leinen)
remotely,
George Brett, Internet2
Russ Hobby, Internet2
listened in
============
Peter Clarke opened the meeting. It was purposely not advertised, but
invitation-only so it can be a working meeting. His particular interest
or angle: interested in near operations side of networks as they affect
applications. Definitely an applications focus.
UCL is formally a center of excellence in e-science in UK,
to help applications, and in particular grid-based applications (and
"grids" themselves) to work.
Why here?
There are/will be measurement points in some form at core routers (GEANT)
useful for network operations
What about "grid operations"?
There are serious professional grid infrastructures being created
around Europe.
Then how can applications use them.
If putting performance measurement points in routers in any way (publishing
information), let's publish in a common way ... preferably through
"standards/proposals in GGF"
The information should only be available with proper authorization.
There are various portals to look at information
Internet2, GEANT, Operations, Grid Operations
If in middle anything with OGSA, then it brings in all the
open grid services
---------
David Salmon wanted to state his interest before the meeting really
got under way by way of introduction as well.
monitoring - working right?
strands about traffic on backbone
current focus is separate from network operations focus
required to report traffic loads to superiors
so do it for "service level agreement monitoring"
UKERNA has 20 regional networks
and a backbone
want to measure traffic to sites
want to carry forward
like escience / data models for gathering
want main thrust - see what's going on in abilene, europe, look for
hook in
Mark Godfrey & Mark Leese -> both funded by escience program
Mark G: from perspective of backbone
Mark L: e-science... applications
In the short term only have resources for simple stuff, mrtg +.
but also gather other information; Looking to integrate with mark leese
measurements... then forward through publishing angle
longer term possibilities: production and development network
tightly focused development in crucial positions within
backbones
borders w/regionals
install platforms capable of taking on board broader things
such as the TF-NGN project in europe, pipes in us
For the even longer term - how expand that capability for more coherent
not just national, but also regionals...
create an architecture that regionals can opt-in to.
..as part of next version of janet.
short-term standards.
real-time stuff and IPv6 will require thinking
more extended platforms
pc: like not being prescriptive, but defining interface and publish & how
(and then not telling people how to do it)
[but give 'em something that want]
For e-science infrastructure in UK, Mark is doing iperf measurements, but
very [intended application] flow specific.
----------
Next up is Eric Boyd to explain the pipes infrastructure being developed
by Internet2.
[see slides, questions noted here]
ds: very interested in observatory
?: what are the differences among the End-to-end Performance Initiative,
pipes, and the Abilene Measurement Infrastructure (AMI)?
[leading question! - see slides]
ds: what about shibboleth?
we need some authentication mechanism, shibboleth is one example.
pc: About the measurement results themselves, are they available to everyone,
or are some protected?
because people can misuse - and in some cases badger people in operations
centers wrongly
eb: possibly availability also based on role
result: add policy lookup on read of db, can be null, but should be there
eb: we haven't thought a lot about policy, stance tends to be open but
we recognize a need.
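[editor's sketch] The "policy lookup on read" result above could look roughly like the following. This is only an illustration under assumptions: the ResultStore class, the role strings, and the keys are all hypothetical and do not come from piPEs. The point is that the lookup always happens, and a null (None) policy simply lets the read through.

```python
from typing import Callable, Dict, Optional

# Hypothetical policy type: takes the requester's role, returns True if allowed.
Policy = Callable[[str], bool]


class ResultStore:
    """Toy measurement database with an optional per-result read policy."""

    def __init__(self) -> None:
        self._results: Dict[str, float] = {}
        self._policies: Dict[str, Optional[Policy]] = {}

    def write(self, key: str, value: float, policy: Optional[Policy] = None) -> None:
        self._results[key] = value
        self._policies[key] = policy  # None means "no restriction"

    def read(self, key: str, role: str) -> float:
        policy = self._policies[key]
        # The lookup is always performed; a null policy allows the read.
        if policy is not None and not policy(role):
            raise PermissionError(f"role {role!r} may not read {key!r}")
        return self._results[key]


store = ResultStore()
store.write("owd:a->b", 12.5)                                # open result
store.write("flows:core", 3.0, policy=lambda r: r == "noc")  # restricted result
```

Here any role can read "owd:a->b", while "flows:core" is readable only by the (made-up) "noc" role.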
General pipes goals:
Build the system as modular as possible
so can swap in new stuff
open source as much as possible
modified Berkeley license for Internet2 software (in particular,
not GPL which hinders our ability to work with corporate partners)
We see an evolving set of tools over time, including
"owamp" - one way ping
throughput
flow data, anonymized
traceroute data (initially literally traceroute
eventually get routing database)
like dynamic looking glass
snmp data out of router interface
open research questions include
algorithms - encoding a network engineer's brain
measurement schema - common data formats
pc: How to discover PMP (performance measurement point) comment - just
because it can be a service doesn't mean it should be.
For example, if the point is very closely connected to router... it's
not obvious that should advertise itself.
You want the thing that a domain presents to world to be discoverable
Can you define all sets of questions that can be asked?
That's an "interface architecture" and it would be good to discuss.
As to security concerns, our current ranking
mainly concerns about DOS attacks
(1) don't use infrastructure as a weapon
(2) the infrastructure (data produced by the infrastructure) is not
compromised by an attack
mz - explains why 4 PCs currently, and abilene observatory idea
pc: (on scheduling active measurements) should be test (dataflow model) -
ensure don't clog network or ...
ds: what's CDMA?
mz: one of the mobile phone technologies in US, has GPS time embedded
in signals
q: owamp, output? understanding tool
not web service (no, Warren is wrapping output in a web service)
q: Internet2 detective - how does it fit in
Currently separate, but one portal to measurement infrastructure
q: platforms? (PC or sparc or ...)
q: ntp, need gps or cdma
depends on what you're trying to measure
----------
Nicolas Simar presented current TF-NGN monitoring work, focus is on
inter-domain: how have tools in backbone plus opt-in tools in other
administrative domains... how find tools in other domain, what protocol
between domains, include authorization. Utilize existing measurements
at first.
[see slides]
domain tool - the entity that exists per-domain. current focus.
measurement points within domain, could be anything. Initially
want to use RIPE Test-Traffic information (cannot schedule tests,
but can get results.)
test definition: like Internet2's owamp +
measurement box guidelines: like Internet2's templates
measurement protocol: for active measurements, what is format of packets?
Perhaps can use OWAMP directly (or common tools?) [OWAMP the protocol,
not the tool being tested by Internet2, is being standardized within
the IETF (IP performance metrics (ippm) working group)]
combining measurements: How combine measurements taken of segments
along an end-to-end path?
- for one-way delay, add?
- but what happens with jitter?
user representation: who present, and what can understand
[what about program?]
pathfinder: how map measurement point with given IP address, say starting from
traceroute. (presuming src,dst IP or name from user)
tests
RTT in addition to owd, so can compare, utilize many traditional
RTT tools.
also what about packet distributions (for active measurements)
- what's best
- looking for input from existing systems
other types: want input from NREN, TF-NGN, APM, etc.: so have
common tests in all parts of network (not partitions)
min level of tests deployed
plus what other tests are desired
want to be as widespread as possible
measurement boxes themselves: have input from
HEANET
GARR
DANTE
GPS and NTP, perhaps can use longwave
d-gps
atomic clock
It isn't all that important what tool. A given tool has some accuracy,
**have to always specify accuracy of measurements**
q: is GPS used by RIPE TT available independent of RIPE?
yes
need ntp config - important to get correct to be accurate
vr: always carry error
also hotels in Ireland retrans 1 PPS signal.
q: who's in the trial?
SWITCH, DANTE, ...
pc: seem to presuppose what arch must be
chains domain tool to domain tool
vs user hits multiple domain tools.
not fully defined. what we've presented is one coherent thought
can access any domain tool if available
referral is one way.
q: concatenated error if concatenate measurements?
yes
vr: hard problem; can concatenate averages, can't necessarily
concatenate percentiles
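[editor's sketch] vr's point can be checked with a few made-up numbers. The per-segment delays below are invented for illustration and assume the two segments were sampled at the same instants, so the i-th values can be summed to give an end-to-end delay. The median stands in for "a percentile" (it is the 50th).

```python
import statistics

# Made-up one-way delay samples (ms) for two consecutive segments,
# taken at the same instants so the i-th values can be summed.
seg_a = [10, 10, 10, 50]
seg_b = [50, 5, 5, 5]
end_to_end = [a + b for a, b in zip(seg_a, seg_b)]

# Averages concatenate: the sum of the segment means is the end-to-end mean.
mean_sum = statistics.mean(seg_a) + statistics.mean(seg_b)
mean_e2e = statistics.mean(end_to_end)

# Percentiles do not: the sum of the segment medians is not the end-to-end median.
median_sum = statistics.median(seg_a) + statistics.median(seg_b)
median_e2e = statistics.median(end_to_end)

print("mean:  ", mean_sum, mean_e2e)
print("median:", median_sum, median_e2e)
```

With these samples the means agree (36.25 ms either way), but the summed medians give 15 ms while the true end-to-end median is 35 ms.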
As to scheduling, do per domain controller.
[know things like, RIPE can't do more than 500,000 tests/sec]
q: 500,000 pps limit? where does it come from?
given to us from ripe, machine specific parameter
expect will be other parameters for other equipment, this is just an
example
start
one-way meas
looking glass functionality
detect dos - scampi passive mon infra
q: data format w/in proto or new proto
[note taker had to leave for 30 min.]
----------
LUNCH.
pc introduces the afternoon:
An XML Schema
motivation from top down, ggf down
mark leese works for daresbury laboratory
sister lab of rutherford
CLRC
end station mon - app level mon at all e-science centers around UK
users end perspective
warren matthews IEPM
victor - TF-NGN view / work.
functional design of domain tool
----------
An XML schema for NMWG Yee-Ting Li, UCL
metrics
all stored in some nodes in some format
usually flat text
extensions: store in db, query by sql
more generic approach: XML
NMWG (network measurement working group of GGF (global grid forum))
characteristics & metrics document
hierarchy
ontology
tools document
maps specific tools to specific points in hierarchy
pull from IPPM
but slightly different goals, but meanings are overlapping & orthogonal
->
what ippm done, is how to measure, format of packets
what metrics looking at
what c&m doc describes: what characteristics actually is
rather than what value is or how measure it
in order to define characteristic, define measurement methodology
> how to measure
(IPPM merges two)
"meaning aspect". meanings to terms (not how measure)
(singleton)(sample)(statistic) -> observation
--is a result of--> measurement methodology -- measures--> characteristics
--describes--> network entity
entity: e2e path, or hop in path or snmp bandwidth or ...
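[editor's sketch] The chain of relationships above (observation, is a result of, methodology, measures, characteristic, describes, network entity) can be written down as plain data classes. The class and field names here are assumptions chosen to mirror the notes, not names taken from the NMWG documents; the host names are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NetworkEntity:
    """What is being described, e.g. an end-to-end path or a single hop."""
    kind: str  # "path", "hop", ...
    source: str
    sink: str


@dataclass
class Characteristic:
    """What is meant (e.g. round-trip delay), not how it is measured."""
    name: str
    describes: NetworkEntity


@dataclass
class MeasurementMethodology:
    """How a characteristic is measured, e.g. ICMP echo via ping."""
    name: str
    measures: Characteristic


@dataclass
class Observation:
    """A singleton, sample, or statistic produced by a methodology."""
    kind: str  # "singleton", "sample", or "statistic"
    values: List[float]
    result_of: MeasurementMethodology


path = NetworkEntity("path", "host-a.example", "host-b.example")
rtt = Characteristic("path.delay.roundtrip", path)
ping = MeasurementMethodology("ping", rtt)
obs = Observation("sample", [20.1, 19.8, 20.4], ping)
```

Following the references from obs leads back to the entity it ultimately says something about: obs.result_of.measures.describes is the path.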
using XML schemas to describe what they have represented
can describe anything - metadata
big industrial standard
used throughout web services and OGSA technologies
XML schema : describes how elements are placed
network entity element
target as endpoints
measurement methodology is hairy
technique for recording
lots of text
stick most important characteristics in this document.
tool: iperf, ping, ...
usually how practically realize methodology
version, name, and a list of parameters specific to tool
layer: describes where in OSI layer tool is working, details
of specific stack implementations, packet size
path: a link or path, technically -- how tool would see network entity
for e2e, see src,sink
inside that path, descriptions of characteristic
list of nodes
characteristics:
0 or more, of all types (delay, loss, ...)
tries to give results of multiple runs (historical set of results)
observation sample
interval (of observations)
observation
interval (of singletons)
singleton, statistics
[ping document sample shown]
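[editor's sketch] The sample shown at the meeting is not reproduced in these notes. As a stand-in, the nesting outlined above (network entity, methodology with tool and layer, path, characteristic, observation sample, observation, singletons) might be generated like this. Every element and attribute name is illustrative only and should not be read as the actual schema; host names and values are made up.

```python
import xml.etree.ElementTree as ET

# All element and attribute names below are illustrative, not the real schema.
entity = ET.Element("networkEntity")

# Measurement methodology: the tool that realizes it, plus layer details.
method = ET.SubElement(entity, "measurementMethodology")
tool = ET.SubElement(method, "tool", name="ping", version="unknown")
ET.SubElement(tool, "parameter", name="packetSize").text = "64"
ET.SubElement(method, "layer").text = "network"

# Path: how the tool sees the network entity (source and sink for e2e).
path = ET.SubElement(entity, "path")
ET.SubElement(path, "source").text = "host-a.example"
ET.SubElement(path, "sink").text = "host-b.example"

# Characteristics live inside the path; results nest as
# observation sample -> observation -> singletons.
char = ET.SubElement(path, "characteristic", type="delay.roundtrip")
sample = ET.SubElement(char, "observationSample")
observation = ET.SubElement(sample, "observation")
for rtt_ms in (20.1, 19.8, 20.4):
    ET.SubElement(observation, "singleton", unit="ms").text = str(rtt_ms)

document = ET.tostring(entity, encoding="unicode")
print(document)
```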
q: this doc is an answer to what question
please tell you everything between 2 ip addresses
or this tool
or ...
ping results
> round trip time measurements?
has network entity as an element, so this describes a network entity
meas methodology to work out characteristics between two elements.
this tries to describe a methodology
one way of realizing that is to describe tool such as ping
describes network entity - node a & node b in context of
network methodology
pc: you said you had to implement some things differently from NMWG document
why?
good reason? yes, xml couldn't implement same way.
...should get reported back to GGF
-=-=-
Mark Leese
UK e-sciences stuff.
has a toolkit installed on end systems within UK
with exception of bbcopy & bbftp work every 1/2 hr
store data locally
1 machine to all. so build mesh. ok, since 10-12 science centers, n^2 isn't
intractable
web interface there
publication service is not there (originally thought about ldap)
* dedicated monitoring node
* similar connectivity to e-science resources
IperfER -> scripts to run iperf (from slac, originally)
MIperfER -> do it in multicast mode!
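[editor's sketch] A quick count of what the full mesh described above costs. The site names are placeholders (the notes give only the count, 10-12 centers), and the half-hourly schedule is taken from the "every 1/2 hr" note.

```python
from itertools import permutations


def mesh_pairs(sites):
    """Every ordered (source, sink) pair in a full mesh."""
    return list(permutations(sites, 2))


# Placeholder site names; the notes only give the count (10-12 centers).
sites = [f"site-{i}" for i in range(1, 13)]
pairs = mesh_pairs(sites)

# One run per ordered pair every half hour is 48 runs per pair per day.
runs_per_day = len(pairs) * 48
print(len(pairs), runs_per_day)
```

For 12 sites this is 132 ordered pairs, which is why n^2 is still manageable at this scale.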
pc: operational question
are the tools demonstrably useable to end application
or because network staff looks at it
people looking on behalf of users...
pc:
iperf used by particle physics uk... look within community
one finds things from time to time. not too often.
skeptical, perceived to be used by people at ends yet
something we need to do.
so have stuff at e-science center, not at particle physics site...
vr: trying to set up PERT like CERT, so people can drop performance problems
there and "will be solved" then these tools & monitoring structure can
be solved.
planned to be running in sept.
http://www.hep.ucl.ac.uk/e2emon/e2emon.html
pc: John McAllester at Oxford, he's run the most powerful tool used in
years: traceping. has a good chance of finding some of the path
problems. combine ping and traceroute. one tool that used to
come bang on your guy's door.
[discussion about how useful it really is]
show where going.
q: precursor to AG tools time. ran tools continuously, so used quite
effectively to debug problems.
q: beacon stuff as well
wishlist features: ask.
http://gridmon.dl.ac.uk/
m.j.leese at dl.ac.uk; http://gridmon.dl.ac.uk/~mjl
----------
Warren Matthews
First, a side note about available bandwidth estimation tool: ABWE
ABWE - agreement with iperf, measure less often
achievable bandwidth one thing
two tools, different estimates
error associated?
turn argument around
rather than this is the number this thing gives
this is the meaning, estimator of that.
when clock is started changes between the tools
..should be taken care of by meaning
On to publishing
no interoperability among methods
SOAP::lite perl module
Python
Java
NMWG
OGSA
--
Publishing
NMWG properties document
path.delay.roundtrip
hop.
soap lite - painfully simple
point to WSDL description
there is a method there
pass arg (endpoints)
src,dst in one field - then can describe either.
slac security does not allow router names to be exposed, even
w/in slac.
./tracespeed - similar to traceroute
need a conditions database.
for example, need to know state of collider to understand phys experiment
similarly, dump sys entries from kernel.
web service front end to Arena (an Internet2 networks database)
--
advisor - human visualization
python web service client within advisor to pull info
--
monalisa is big thing
CMS (physics database) it's the primary monitoring.
http://monalisa.cern.ch/MONALISA
monitor farms.
info that's plugged in there, run through web services
so can grab whatever I want.
why publish:
troubleshooting
RIPE-TT
AMP Automatic Detection System
using both of those + diurnal changes.
Looking for anomalies...
parameterize perf in terms of hr and variability w/in that hourly bin
measurements can be characterized in terms of how they differ from
historical value
compare w/prev bin to reduce false-positives
median & standard deviation for last 5 measurements in bin (weeks)
concerned if latest measurement is more than 1 sd from median
alarmed if latest measurement is more than 2 sd from median
(tony's work is similar)
only write to the log if an alarm is triggered
keep writing until alarm is cleared
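[editor's sketch] The binning and thresholds above can be put together roughly as follows. This is a reading of the notes, not Warren's code: the class name is hypothetical, the sample standard deviation is an assumption (the notes just say "sd"), and the comparison with the previous bin and the logging rule are left out.

```python
import statistics
from collections import defaultdict, deque

HISTORY = 5  # the notes use the last 5 measurements in a bin


class HourlyAnomalyDetector:
    """Compare each measurement with the history for the same hour of day."""

    def __init__(self):
        # hour of day (0-23) -> most recent measurements seen in that hour
        self.bins = defaultdict(lambda: deque(maxlen=HISTORY))

    def check(self, hour, value):
        history = self.bins[hour]
        status = "ok"
        # Only judge once the bin holds a full history.
        if len(history) == HISTORY:
            median = statistics.median(history)
            sd = statistics.stdev(history)  # assumption: sample standard deviation
            deviation = abs(value - median)
            if deviation > 2 * sd:
                status = "alarmed"
            elif deviation > sd:
                status = "concerned"
        # Note: anomalous values also enter the history in this simple version.
        history.append(value)
        return status


detector = HourlyAnomalyDetector()
for throughput in [100, 102, 98, 101, 99]:
    detector.check(14, throughput)   # fill the 14:00 bin
print(detector.check(14, 102))       # a little off the median
print(detector.check(14, 80))        # far off the median
```

A caller would write to the log only when check() returns "alarmed", and keep writing until it returns to "ok".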
NetRat
------
alarm system
multiple tools
multiple measurement points
* cross reference
trigger future meas
starting point for human intervention
informant database: hop.performance
no measurement is authoritative
GLUE, OGSA, CIM
^ w/us
^ grid services
^nmwg
work in europe and work in US has drifted, so want to bring back
publishing and troubleshooting
discovery
security
"it is widely believed that a ubiquitous monitoring infrastructure is required"
DOE apparently doesn't want to fund any more.
--------
victor going back to this morning
details on domain tool design
on (domain tool block)
q: specific tool driver -> specific tool
could have been different, thing attached to router isn't specific
tool interface
snmp variables in router are measurement points.
could have interface that popped out nmwg characteristics out
that is "generic".
that debate should go on (at least understand)
does have API interface or "driver interface" that does this
(so switch partition slightly)
driver vs. collector
...
multi level data analysis, at least 3
measurement point
domain
interdomain
aggregation function
statistical functions on raw data
adding averages, concatenation, at multidomain level
should agree on main protocols
http://www.dante.net/tf-ngn/perfmonit/
nicolas.simar at dante.org.uk
http://www.dante.net/tf-ngn/pert
victor.reijs at heanet.ie
----------
Peter Clarke, wrap up:
pragmatically
domain - access interface
same idea, but arbitrarily different implementation of ideas
not constraining by doing access interface
vr: only really need to web, domain tools still working
from bottom up: have different things that measure, where is first point
we can comment
no resources for new, but meas made could be published
>>>box that takes measurement
>>>box that delivers one or more characteristics
may not constrain minimum number of tools
Two functions on same piece of hardware
bits flowing between boxes ("PMP")
other stuff: local scheduler, store
AAA, sched for machine, data store ("PMC")
PMC-PMP might have 1:1 relationship
but could be that PMC reps one or more, and not necessarily colocated.
that bridges difference between Eric's outline and what Nicolas's
(or domain concept)
doesn't matter what we get out as long as common.
On Day 2, we did a reasonable amount of work revising the v1.0 piPEs architecture picture. The following diagram replaces the Scheduler, a pair of PMPs, the result arbiter, the test arbiter, and the database.
A bit about the new concepts:
Mapping old to new, the Source and Sink both correspond to PMPs on the original diagram. The test arbiter, central scheduler, and individual PMPs are replaced by a domain interface (representing the test arbiter and the administrative domain's AAA), the PMC (representing the measurement node's local scheduler and AAA), and the PMP (representing the barebones ability to accept a test initiation request, to run a test, to store the data locally, and to send it to the database gatekeeper, which used to be called the result arbiter).
The idea here is that these three components can be thought of as independent processes that may or may not run on the same machine. There may be one or more domain interfaces per administrative domain. There may be one or more PMCs per domain interface. There may be one or more PMPs per PMC. One common case is likely to be a single domain interface, and several PMC/PMP/hardware packages per administrative domain.
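As a rough illustration of these one-to-many relationships, the three components can be modeled as nested records. This is only a sketch: the class layout, field names, and host names are assumptions made for the example, and nothing here says how the real processes are deployed or talk to each other.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PMP:
    """Performance measurement point: runs tests, stores data locally."""
    host: str


@dataclass
class PMC:
    """Local scheduler and AAA for one or more PMPs."""
    host: str
    pmps: List[PMP] = field(default_factory=list)


@dataclass
class DomainInterface:
    """Test arbiter and AAA for an administrative domain."""
    domain: str
    pmcs: List[PMC] = field(default_factory=list)

    def all_pmps(self) -> List[PMP]:
        """Flatten the PMPs behind every PMC of this domain interface."""
        return [pmp for pmc in self.pmcs for pmp in pmc.pmps]


# The "common case": one domain interface, several co-located PMC/PMP packages.
domain = DomainInterface(
    "example.org",
    [PMC(f"box{i}.example.org", [PMP(f"box{i}.example.org")]) for i in range(3)],
)
```

A PMC that represents several PMPs on other machines is expressed the same way, by giving it a longer pmps list with different hosts.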
In the (rare?) case where the source of the test does not capture the performance data, it is assumed that the tool is wrapped in a script that makes it so. It is also assumed that *all* tests have a source and a sink. Even in the nominally "one-ended" case of 2-way ping, this would enable the sink to unblock a firewall, for example, and would also prevent an overload of pings against the sink.
There's an implicit restriction on the source PMC and PMP, requiring that they ONLY perform tests with other, known PMC/PMP machines. This has the desired effect of reducing the chance that a source could be hijacked for DOS attacks against unrelated machines.
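A minimal sketch of that restriction, assuming the source keeps a fixed set of known sink addresses. How the set is populated, and whether it is a static list at all, was not settled in the discussion; the class name and the addresses (documentation ranges) are made up.

```python
class SourcePMP:
    """Toy source PMP that refuses to test against unknown sinks."""

    def __init__(self, known_sinks):
        self.known_sinks = frozenset(known_sinks)

    def start_test(self, tool, sink_addr):
        if sink_addr not in self.known_sinks:
            # Refusing here is what keeps a hijacked source from being
            # pointed at an unrelated machine.
            raise PermissionError(f"{sink_addr} is not a known PMC/PMP")
        return f"running {tool} against {sink_addr}"


pmp = SourcePMP(known_sinks={"192.0.2.10", "192.0.2.11"})
print(pmp.start_test("owamp", "192.0.2.10"))
```

A request naming any address outside the known set raises PermissionError before any test traffic is generated.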
Our design goals included modularity, minimizing the risk of DOS attacks (against or as agent thereof), flexibility of deployment, clean interfaces, and applicability to most tools.
Note, we fully expect to design an additional module, currently represented by a dotted line, that passes through A and P unchanged, but links B to O.
Here's the protocol diagram we came up with on 5/16/03:

Here's a revised version of the Architecture diagram I did after the fact to match up with the protocol diagram:

A. Here is who I am, and I want "this result" to exist (characteristic)
and that includes the IP addresses of the two routers [[may be 1 if
it's a passive result]]
- newness of test result
B. *no, go away [rejected (and here's why)]
*yes, result exists (with pointer)
*be patient, I will send response later (maybe with time estimate)
[Q: send response later or ask to call back?]
C. I would like to contact the PMC associated with this router,
here's who I am, and here's who I am asking on behalf of,
and here's the tool I'd like to run
D. *Get lost [not authorized, tool unavailable]
*IP address of sink PMC, capability to use
[or under hood, tell sink PMC to trust source PMC for a little while]
E. Initiate tool
- request interface token
- tool and arguments to run
- Sink PMC IP address, capability {is a token}
F. here's my capability, can we start tool X with certain parameters now?
[[need source PMP IP address?]]
G. Start receptor
- prepare to receive tool
- IP address of source could be parameter [[never given?]]
- Identity of Sink PMC (token)
H. *OK
*Rejected can't do it
[[pmp doesn't do scheduling]]
I. *Rejected (don't like that capability; failed to initiate receptor)
*willing and able, IP addr of sink PMP (really machine) to use
*Ask again in +T time units
J. Start tool:
- identity of source PMC (token)
- tool
- arguments for tool
- IP addr of sink PMP
K. Test stream from tool
L. Results
- identity of source PMP
- identity of sink PMP
- characteristic [[?]]
- tool output [[in standard form?]]
- [[accuracy, other ancillary data that is relevant]]
(implicit ACK from DB, L is a stream)
M. Result report
- id of src PMP
- result ready, pointer to result (pointer contains capability?)
- result failed
-- test failed
-- write to DB failed
N. Result report
- id of src PMC
- result ready, pointer to result
- result failed
--test failed
--write to DB failed
(message B, actually is next)
O. Give me result
- who I am
P. *Rejected
*Result
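The messages above say only that the capability "is a token" and that the sink PMC should trust the source PMC "for a little while". One possible way to realize that, shown purely as a sketch, is a short-lived signed token: the domain interface issues it in message D, the source PMC presents it in message F, and the sink PMC answers as in message I. The HMAC construction, the shared secret, the field layout, and the function names are all assumptions and were not part of the 5/16/03 discussion.

```python
import hashlib
import hmac
import time

# Assumption: the domain interface and the sink PMC share this secret out of band.
SHARED_SECRET = b"example-secret"


def _sign(payload):
    return hmac.new(SHARED_SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_capability(source_pmc, tool, lifetime_s=60, now=None):
    """Message D: the domain interface grants a short-lived capability."""
    issued = time.time() if now is None else now
    expires = int(issued + lifetime_s)
    payload = f"{source_pmc}|{tool}|{expires}"
    return f"{payload}|{_sign(payload)}"


def check_capability(token, source_pmc, tool, now=None):
    """Messages F and I: the sink PMC decides whether to honor the capability."""
    try:
        tok_source, tok_tool, expires_text, sig = token.split("|")
        expires = int(expires_text)
    except ValueError:
        return "rejected"  # malformed token
    payload = f"{tok_source}|{tok_tool}|{expires_text}"
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return "rejected"  # forged or altered token
    if (tok_source, tok_tool) != (source_pmc, tool):
        return "rejected"  # token was issued for someone or something else
    current = time.time() if now is None else now
    if current > expires:
        return "rejected"  # the "little while" is over
    return "willing and able"


token = issue_capability("pmc-a.example", "owamp", lifetime_s=60)
print(check_capability(token, "pmc-a.example", "owamp"))
```

Names containing "|" would break this toy encoding; a real design would also have to cover the "ask again in +T time units" answer and how the secret is distributed.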