The final version of the session I presented at IBM ConnectED 2015 in Orlando.
Abstract: With the reach of Voice and Video now including mobile devices, enterprises are beginning to see these as a must-have. But beyond the basics, how does it scale up? In this session we focus on the real-world demands for Voice and Video – what users really want, including the Sametime 9 Mobile Client – and what enterprises really need, including not only scalability but also redundancy and security. The architecture and lessons learned from real-world deployments will be discussed to give a heads-up on what to expect both on-prem and in the cloud. The systems featured in the presentation span the entire range of voice and video from Conference, Video and Bandwidth Managers to Edge and TURN servers.
2. Bigger On the Inside
When It's Working Troubleshoot It
When It's Fixed Make It Mobile
Bigger On the Outside
When It's Resilient, Break It
When It's Secure Hack It
Lessons Learned
Outside the Box
3. Bigger on the Inside
More to unlock - which you have already paid for - than you may have previously thought
Simple per-user licensing
No additional software cost to add voice and video
No additional software cost to cluster for scale and reliability
Mobile device access is always included
4. Bigger on the Inside - Sametime AV Product Options
“Sametime Audio Video” : Connect Client to Client AV “calls”, click user (no number)
“Sametime Voice” / “ST ” / “SUT-Lite” : Client calls to/from Phones
and external Video System/Clients – by numbers/SIP URIs (sip:...@...)
Android / iOS Mobile Clients provide connectivity for the Mobile User
Sametime Meetings offers a zero-download AV browser client
Sametime Video Manager/MCU will talk to any/all such clients for Conferencing
(Full-Fat) Sametime Unified Telephony : phone control (full telephony)
All of the above uses SIP, SDP and RTP – for more details see last year’s presentation
http://www.slideshare.net/kbmsg/jmp206
[Diagram: Sametime packages Communicate, Conference and Complete, plus SUT]
5. Crash Recap of Voice, Video, Conferencing and Telephony Terminology
SIP - Session Initiation Protocol: standard for making calls (sessions) between endpoints using INVITEs; endpoints which may move typically REGISTER first
SDP - Session Description Protocol: standard for describing audio/video/etc sessions
(S)RTP - (Secure) Real-time Transport Protocol: standard for sending/receiving audio/video/etc in packets
Codec - standard for packaging audio/video – G.711 is telephone quality voice, G.729 (patented/licensed) and iLBC (open source/free) are highly compressed audio
MCU - Multipoint Control Unit: audio/video mixer for conference calls
TLS - Transport Layer Security: encryption standard providing secure communications; early versions of TLS were called SSL (Secure Sockets Layer)
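To make the SDP and codec terms above concrete, here is a minimal sketch that lists the codecs offered on each media line of an SDP body. The SDP text is an invented example (not captured from Sametime) and the function name is hypothetical; PCMU and PCMA are the two G.711 variants.

```python
# Minimal SDP sketch: list the codecs offered on each media (m=) line.
# SAMPLE_SDP is an invented example, not a capture from a Sametime client.

SAMPLE_SDP = """v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
"""


def codecs_by_media(sdp: str) -> dict[str, list[str]]:
    """Return {media type: [codec names]} based on the rtpmap attributes."""
    result: dict[str, list[str]] = {}
    current = None
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<media> <port> <proto> <payload types...>
            current = line[2:].split()[0]
            result[current] = []
        elif line.startswith("a=rtpmap:") and current is not None:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            encoding = line.split(" ", 1)[1]
            result[current].append(encoding.split("/")[0])
    return result


if __name__ == "__main__":
    print(codecs_by_media(SAMPLE_SDP))
```

A real client offer is much longer (many codecs, ICE candidates, encryption options), which is why packet size becomes an issue later in this deck.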
6. KickOff: Consider Your Raison D’Etre for Sametime Voice and Video
Do you want to cut costs by reducing phone handsets?
– Does Sametime Voice and Video therefore need to be as reliable as your PBX?
– Can ST fit into your dialplan?
Or cut costs by centralizing external calls?
– Watch out for internal billing issues as well as regulatory restrictions
– If you want to keep external calls routing out at each site, the configuration is very complex without SUT
Or cut costs by using internal conferencing?
Or simply improve collaboration?
7. KickOff: When you think you know what to do, Assume you don’t!
Hold at least one full day workshop with all parties - including decision makers - to
– Discuss functional as well as non-functional (scale, resilience, security) requirements
– Ensure everyone is aware of all the possibilities
Compile, document and plan to perform a comprehensive Test Plan
– Anything not tested is not guaranteed to work – so test everything and do not just cover a few use cases
Plan a Pilot in an equivalent environment to production OR Plan a suitably sized
Reference/Staging environment – clustered and secure if these will be used in production
– reconfiguring and re-testing for complex issues in production is painful!
9. “When It’s Working, Troubleshoot It!”
Understanding the basic ways the calls flow
– Arm yourself for real troubleshooting
– Prepare your mind for the added complexities of clustering
10. ST Connect Client to Client Call Flow
[Diagram: two Clients, CS, CM, SIPPR and BWM linked by VP, SIP and RTP paths, with steps numbered 1 to 9 as described below]
Client asks Conference Manager (CM) to set up a call
via Virtual Places (VP) request to Community Server
(CS) (1,2)
CM sends all SIP requests through SIP Proxy and
Registrar (SIPPR) (3,6)
SIPPR may consult with Bandwidth Manager (BWM)
– a B2BUA* which can modify SDP or deny call (4,7)
CM/SIPPR sends requests to Caller Client first (3,5)
and then Called client (6,8)
Clients accept calls (200 OK) with media details in
SIP SDPs - these flow through the above paths in
ACKs and (re-)INVITEs, giving each client the details
Real-time Transport Protocol (RTP) audio/video flows directly
from client to client (9)
* B2BUA = SIP Back to Back User Agent, this is two SIP User Agents (UAs) combined:
a User Agent Server (UAS) which receives a call and a separate User Agent Client (UAC) which
initiates a new call based heavily on the original call but modified as required
Video Manager not involved even for Video Calls
Two “calls” without BWM, Four with it
SIPPR and BWM “see everything” EXCEPT
conditions at/between the Clients
11. Conference Leg Call Flow
[Diagram: Client, CS, CM, SIPPR, BWM, VMGR and VMCU linked by VP and SIP paths, with steps numbered 1 to 10 as described below]
Client asks CM to set up calls via VP request to CS (1,2)
CM sends SIP requests to clients through SIPPR (3,5)
SIPPR consults with BWM if configured (4)
SIPPR sends request to Client and it accepts call (200 OK) (5)
CM sends call request direct to Video Manager (VMGR) (6)
VMGR sends new call request (like a B2BUA) to Video MCU
(VMCU) via SIPPR and BWM if configured (7,8,9) – VMCU
accepts call (200 OK), responding via port 15000
Confirmations (ACK) flow through the above paths, ultimately
exchanging client AV details with the VMCU in the SIP SDPs
RTP AV flows between VMCU and clients (10)
There are, as a result of BWM and VMGR, 5 calls/sessions here and even SIPPR cannot see the entire set of
calls/sessions (as CM talks directly to VMGR with call/session details different to the VMGR call to VMCU) –
without BWM there would still be 3 calls, two of which SIPPR would not see as CM would send one call to
VMGR and the VMGR would send a call with different call/session details to VMCU directly
Note: The CM, VMGR and VMCU also communicate with each other to ready conference bridges for use by
means of XML over HTTPS/HTTP on various ports such as 8443, 9443, 443 and 8080
Three “calls” without BWM, Five with it
SIPPR “blind” to CM <-> VMGR
VMGR always involved
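The call counts quoted for the two flows (two or four for Client to Client, three or five for a Conference Leg) can be reproduced with a small model. This is only a sketch under the assumption that each SIP leg counts as one call and that BWM, being a B2BUA, turns every leg it sits on into two; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    caller: str
    callee: str
    bwm_on_path: bool  # True if BWM (when configured) sits on this leg


# Client to Client: CM calls each client through SIPPR.
CLIENT_TO_CLIENT = [
    Leg("CM", "Caller Client", bwm_on_path=True),
    Leg("CM", "Called Client", bwm_on_path=True),
]

# Conference Leg: CM -> Client, CM -> VMGR (direct), VMGR -> VMCU.
CONFERENCE_LEG = [
    Leg("CM", "Client", bwm_on_path=True),
    Leg("CM", "VMGR", bwm_on_path=False),  # SIPPR and BWM never see this one
    Leg("VMGR", "VMCU", bwm_on_path=True),
]


def count_calls(legs: list[Leg], bwm_configured: bool) -> int:
    """Each leg is one call; a B2BUA on a leg splits it into two calls."""
    extra = sum(1 for leg in legs if leg.bwm_on_path) if bwm_configured else 0
    return len(legs) + extra


if __name__ == "__main__":
    for name, legs in (("client to client", CLIENT_TO_CLIENT),
                       ("conference leg", CONFERENCE_LEG)):
        print(name, count_calls(legs, False), count_calls(legs, True))
```

Counting legs this way is a reminder of how many separate traces may need correlating when troubleshooting a single user-visible call.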
13. “When It’s Fixed, Make it Mobile!”
Mobile Client access is available for all ST Packages: Communicate, Conference and Complete
Traversing the public internet, DMZ, etc. securely adds significant complexity
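14. External/Mobile Clients (Best Practice)
[Diagram: a Mobile Client on the Internet (private intranet / WiFi / 4G etc.) reaches the Corporate Intranet (ST Proxy, CS, CM, SIPPR, BWM, VMGR, VMCU, DB2) through HRPS (HTTPS, translated to VP by ST Proxy), SIP EDGE (SIP) and TURN (STUN/TURN, tunnelled RTP) servers in the DMZ; APNs is also shown]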
Mobile Clients use an HTTP Reverse Proxy
Server (HRPS) to talk to the Sametime Proxy
Server which translates from HTTPS to Virtual
Places (VP), allowing Mobile Clients to access
all of the services of Community Server
Mobile Clients rely on SIP EDGE server for SIP
to reach SIPPR and TURN server for RTP to
reach intranet - such as VMCU or other clients
An External ST Connect Client would use a
Sametime Multiplexer (MUX) in the DMZ
instead of HRPS and ST Proxy but would still
use SIP EDGE and TURN servers
The Sametime Meetings zero-download
browser client plugin for AV also uses the
Sametime Proxy Server (and HRPS if external)
15. Client to External/Mobile Client Call
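[Diagram: the Client to Client flow (Client, CS, CM, SIPPR, BWM over VP and SIP, steps 1 to 8) extended with SIP EDGE and TURN servers in the DMZ between the Corporate Intranet and an External/Mobile Client on the Internet (private intranet / WiFi / 4G etc.), adding SIP via EDGE, STUN/TURN and tunnelled RTP paths, steps 9 to 12]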
Flow is as Client to Client flow but SIP Edge server handles SIP to External/Mobile Client (9)
External/Mobile Client uses Interactive Connectivity Establishment (ICE) with STUN (Session Traversal Utilities for NAT) / TURN (Traversal Using Relays around NAT) server to determine all RTP candidates (10) before media flows - which in this case uses TURN server to relay the RTP (11,12)
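Why the TURN relay ends up as the fallback rather than the first choice follows from how ICE ranks candidates. The sketch below applies the candidate priority formula and the recommended type preferences from the ICE specification (RFC 8445); whether Sametime clients use exactly these preference values is an assumption, and the function name is hypothetical.

```python
# ICE candidate priority as defined in RFC 8445:
#   priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
# The type preferences below are the values recommended by the RFC; actual
# clients may choose different ones (assumption for illustration).
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str,
                       local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """Compute the ICE priority of one candidate (component 1 is RTP)."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


if __name__ == "__main__":
    ranked = sorted(TYPE_PREFERENCE, key=candidate_priority, reverse=True)
    for cand_type in ranked:
        print(f"{cand_type:6} {candidate_priority(cand_type)}")
```

Direct (host) and server-reflexive (STUN) candidates are tried in preference to relayed (TURN) ones, so media only flows through the TURN server when nothing more direct passes the connectivity checks.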
16. Conference Leg with External/Mobile Client
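[Diagram: the Conference Leg flow with the VP request arriving from ST Proxy or MUX in the DMZ; CS, CM, SIPPR, BWM, VMGR and VMCU in the Corporate Intranet; SIP EDGE and TURN servers in the DMZ; the External/Mobile Client on the Internet (private intranet / WiFi / 4G etc.) using SIP via EDGE, STUN/TURN and tunnelled RTP; steps numbered 1 to 13]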
Flow is as Conference Leg flow but SIP Edge server handles SIP to External/Mobile Client (6)
External/Mobile Client uses ICE with STUN / TURN server to help determine RTP candidates (7) before final negotiation of RTP stream - which in this case uses TURN server to relay the RTP (12,13)
17. Considerations for Mobile/External Clients
Use Split Horizon DNS if services are available both internally and externally – inside and outside addresses for:
– SIP Proxy and Registrar / SIP EDGE
– TURN Server (0.0.0.0 internally)
– Sametime Proxy Server / HRPS
– Sametime Meeting Server / HRPS
– Community Server / Mux
Consistent domain name (eg, thinkrite.com) for LTPA tokens to work correctly
TLS Certificates from official Certificate Authority using this consistent domain name
For STUN/TURN no NAT can be configured and firewalls must be in transparent/bridging mode
as Clients must be able to connect to TURN servers in DMZ directly for STUN (3478) to work
VMCU must be able to talk to TURN in the same way and send/receive RTP (20830+/40000+)
18. Troubleshooting Mobile/External Clients
488 Not Acceptable Here often indicates an unexpected failure to establish AV via ICE/STUN/TURN – check that TURN server (via its hostname on STUN port 3478) AND other Client is reachable (Firewalls / NAT / VPNs / routing / DNS may prevent it – this may not be immediately evident as both Clients may be able to chat through CS/MUX/Proxy, reach SIPPR/EDGE, etc.)
ICE time-out errors – AV may still be established – network/negotiations may be strangely slow – try changing RTO in Media Manager ICE properties in Sametime System Console to 500
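One way to check TURN server reachability on the STUN port from a client network is to send a STUN Binding Request and wait for a success response. The sketch below builds the 20-byte request defined by the STUN specification (RFC 5389) and sends it over UDP; it assumes the server answers unauthenticated Binding Requests on UDP 3478, which may not hold for every deployment, and the function names and hostname are hypothetical.

```python
import os
import socket
import struct

STUN_MAGIC_COOKIE = 0x2112A442   # fixed value from RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101


def build_binding_request(transaction_id: bytes) -> bytes:
    """Build a STUN Binding Request with no attributes (20-byte header only)."""
    if len(transaction_id) != 12:
        raise ValueError("transaction id must be 12 bytes")
    # message type, message length (0 = no attributes), magic cookie
    return struct.pack("!HHI", BINDING_REQUEST, 0, STUN_MAGIC_COOKIE) + transaction_id


def stun_reachable(host: str, port: int = 3478, timeout: float = 2.0) -> bool:
    """Return True if host answers our Binding Request with a matching success."""
    txid = os.urandom(12)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_binding_request(txid), (host, port))
            data, _ = sock.recvfrom(2048)
        except OSError:  # covers timeouts, DNS failures and unreachable hosts
            return False
    if len(data) < 20:
        return False
    msg_type, _length, cookie = struct.unpack("!HHI", data[:8])
    return (msg_type == BINDING_SUCCESS
            and cookie == STUN_MAGIC_COOKIE
            and data[8:20] == txid)


if __name__ == "__main__":
    # turn.example.com is a placeholder hostname, not a real deployment.
    print(stun_reachable("turn.example.com"))
```

Running the same probe from the intranet, the DMZ and a mobile network shows quickly which path is blocked, even when chat and SIP signalling appear to work.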
20. Scaling Up SIPPR/CM: WAS-SIP Container-Based Servers
WebSphere Application Server can be
clustered vertically (on same machine) or
horizontally (on different machines) – in either
case the active memory is shared
For SIP Applications the amount of
communications to share active memory
between physical machines is very high
WAS Clusters must be fronted by a WebSphere
Proxy Server (simple to create using
SSC/Deployment Manager) with the main IP
address, this is a stateless SIP Proxy which
load balances WAS instances and offloads the
actual TCP/IP or TLS connections from them
[Diagram: a WS Proxy Server (WPS) distributes load and maintains sessions over TCP/UDP/TLS in front of a WAS Cluster (WAS1, WAS2) with a shared environment]
21. Gotchas for Scaling up Conference Manager / SIP Proxy & Registrar
Without Clustering both CM and SIPPR can be on same server, but with Clustering they
must be in separate clusters
Limiting factor is the ability of WS Proxy to handle connections (now in
SIP/SIPS_PROXY_CHAIN > inbound channel, was 20,000 before)
OS capabilities may need to be tuned as may external factors such as LDAP
Installing multiple WAS instances on the same machine may result in port conflicts (can be
resolved by manual editing or WAS 8.5.5.2)
Some manual editing of files outside of SSC configuration is required - clustered CMs each
need a separate stavconfig.xml file with a different NotificationServerHost (CM’s own
FQDN) / NotificationServerPort (normally 9443)
– http://www-01.ibm.com/support/docview.wss?uid=swg21663243
22. How One becomes Many – Scaling Up
[Diagram: a Standalone Media Manager (CM and PR, could also include SSC and DB2) becomes a Clustered Media Manager with a CM WAS Cluster (CM1, CM2) behind a CM WS Proxy Server (PS1) and a SIPPR WAS Cluster (PR1, PR2) behind a SIPPR WS Proxy Server (PS2), plus SSC and DB2; SIP port labels :5060, :5080, :506x, :506y, :508x, :508y]
SSC includes deployment manager for all CM, SIPPR, PS, etc.
23. Gotchas for Scaling up ST 9 SIPPR
Single Handled Domain must be configured for ST9
– Use the same FQDN as the DNS for SIPPR, same domain as in your certificates
– Clients/trunks setting this domain is all important – all incoming calls/SIP is expected to feature
this name in the Request URI/To headers for SIPPR to use rules to send calls to clients – all
other SIP will just be forwarded according to Request URI (which could result in a loop and 483
Too Many Hops if that address comes back to SIPPR itself)
– For Sametime Voice/Phone/SUT-Lite Conference Manager constructs a MESSAGE for client
notification based on the received INVITE, only sending it to the Proxy Registrar if the Request
URI for a received call matches the SIP Proxy Registrar FQDN shown in stavconfig.xml
sippr.thinkrite.com
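The 483 Too Many Hops loop mentioned above can be illustrated with a toy model of a proxy that only applies its rules when the Request-URI host matches its handled domain. This is a simplified sketch, not SIPPR's actual logic: the DNS table, the alias sip.thinkrite.com, the IP address and the function name are all hypothetical, and 70 is the initial Max-Forwards value recommended by RFC 3261.

```python
MAX_FORWARDS_DEFAULT = 70  # initial value recommended by RFC 3261


def proxy_request(uri_host: str,
                  handled_domain: str,
                  dns: dict[str, str],
                  proxy_ip: str,
                  max_forwards: int = MAX_FORWARDS_DEFAULT) -> tuple[int, int]:
    """Return (SIP status, hops used) for a request arriving at the proxy."""
    hops = 0
    while True:
        if uri_host == handled_domain:
            return 200, hops   # rules apply, call goes to the registered client
        if max_forwards == 0:
            return 483, hops   # Too Many Hops
        max_forwards -= 1
        hops += 1
        if dns.get(uri_host) != proxy_ip:
            return 200, hops   # forwarded elsewhere; assume the far end answers
        # Otherwise the forwarded request lands back on this same proxy.


if __name__ == "__main__":
    dns = {"sippr.thinkrite.com": "192.0.2.5", "sip.thinkrite.com": "192.0.2.5"}
    print(proxy_request("sippr.thinkrite.com", "sippr.thinkrite.com", dns, "192.0.2.5"))
    print(proxy_request("sip.thinkrite.com", "sippr.thinkrite.com", dns, "192.0.2.5"))
```

In this model a DNS alias that points at the proxy but differs from the handled domain burns through all 70 hops before failing, which is the symptom to look for in traces.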
24. Scaling Up VMGR and VMCU Servers
VMCUs run on Linux only and are not WebSphere/Java-based, they can be configured in resource pools for specific geographic areas or for other purposes
VMGRs while running with SIP in WebSphere (on Linux only) do not use the WebSphere SIP Container so cannot use the WebSphere Proxy – they include their own load balancer component running on ports 5080 and 7443 instead of 5060 and 8443
Solid database replicates information from Master (M) to Hot Standby (HS) and other Replicas (R)
[Diagram: a VMGR Farm (VMGR1 to VMGR3 as Master, Hot Standby and Replica, each with its own load balancer (LB) distributing load and maintaining sessions) in front of a VMCU Farm (VMCU1 to VMCU3) split into VMCU pool 1 and VMCU pool 2; port labels :5060/:8443, :5060/:8080 and :5080/:7443]
25. “End to End” AV Scaling (without EDGE/TURN)
[Diagram: Client and CS connect through three WS Proxies (WPS1 to WPS3) fronting a CM Cluster (CM1, CM2), a SIPPR Cluster (PR1, PR2) and a BWM Cluster (BWM1, BWM2) with DB2, then a VMGR Farm (VMGR1, VMGR2 with VMGR LB1, LB2) and a VMCU Farm (VMCU1 to VMCU3); paths are shown for Calls to Clients, Inbound Calls (SUT-Lite) and Conference Calls]
27. “When it’s Resilient, Break It!”
(Take full backups and test restore procedure first!)
Perform Failover testing, initially “gently” but also try more severe tests
Have clients logged in and make calls at the time of the tests to see what happens
28. Redundancy for WAS-SIP Container-Based Servers like SIPPR and CM
For a single IP address to reach these clusters
use simple Load Balancers/IP Sprayers such as
WebSphere EDGE Components LB for IPv4/6
or F5 BIG-IP LTM
[Diagram: Load Balancers (LB1, LB2, holding a Virtual IP Address (VIP) that can fail over between them) spray IP/details from a single address over TCP/UDP/TLS to WS Proxy Servers (WPS1 to WPS3), which distribute load and maintain sessions over TCP/UDP/TLS for a WAS Cluster (WAS1 to WAS3) with a shared environment]
29. Load Balancers
One Load Balancer server can theoretically be used for all Sametime Servers (a redundant
pair is obviously recommended!)
– Needs a FQDN and Virtual IP address for each Sametime Service (SIPPR, CM, VMGR, Proxy,
Meetings, TURN) – plus its own physical address(es)
MAC Forwarding – fastest option (and LB out of IP connection) but must be on same VLAN
– Necessary to set up a loopback (extra, non-ARP, not in routing table) IP address on the WS Proxy
etc. to receive packets from the Load Balancer
– LVS/IPVS uses same technique for “Direct Connection” (F5 calls this L2 nPath routing)
Other methods overcome VLAN/loopback limitations but are slower and interfere more
– Encapsulation/Tunnelling (F5 L3 nPath routing), NAT/SNAT (source address translation), etc.
– With SNAT must configure special settings in WS Proxy to rewrite packet details – IP address of
Load Balancer to FQDN of service
/etc/sysctl.conf / sysctl -w net.ipv4.conf.all.arp_ignore=3 net.ipv4.conf.all.arp_announce=2
ip addr add $CLUSTER_ADDRESS/32 scope host dev lo
30. Gotchas for Scaling up
The Load Balancer must be extremely simple for the WS Proxy / Application Server logic to
work correctly
– Ideally just the Layer 2 (MAC) address details of the IP packet are changed to forward the packet,
allowing the WS Proxy to take over negotiating the entire TCP/TLS session
– If the Load Balancer is to actually read and forward a new TCP/TLS packet no SIP details should be
changed and no new headers should be added
– For F5 BIG-IP Local Traffic Manager (LTM) do not configure SIP / SIP Persistence / mblb profiles
as these result in LTM acting like a SIP UA/Proxy and Via/Record-Route headers are added – this
results in lost connections after around 5 minutes because:
- the WS Proxy detects this SIP UA is in front of the client and doesn’t add RFC 5626 flow tokens
- special TCP keep-alive messages on the SIP connection do not make it through the F5
31. WS Proxy Health Check Settings
An intelligent Load Balancer will only send packets to online WS
Proxies – which they can determine from responses to SIP
OPTIONS requests – the WS Proxy should respond
immediately to such OPTIONS
(in comparison the WS Proxy uses Distribution and Consistency
Services (DCS) rather than SIP to determine if its Application
Servers – eg, SIPPR - are running)
If you need to configure more than two addresses it is possible
to modify the comma separated LBIPAddr setting in the file
proxy-settings.xml – but returning to the configuration page will
remove all but the first two addresses
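For testing how a WS Proxy responds to the SIP OPTIONS health check described above, it can help to generate such a request by hand. The sketch below builds a minimal OPTIONS message with the headers RFC 3261 requires; the exact request a given load balancer sends is not stated in this deck, so the header values, the healthcheck user and the function name are assumptions.

```python
import uuid


def build_options(target_host: str,
                  local_host: str,
                  local_port: int = 5060,
                  transport: str = "TCP") -> str:
    """Build a minimal SIP OPTIONS request (RFC 3261 mandatory headers only)."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic cookie prefix
    lines = [
        f"OPTIONS sip:{target_host} SIP/2.0",
        f"Via: SIP/2.0/{transport} {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:healthcheck@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # lb1.thinkrite.com is a placeholder for the load balancer's own hostname.
    print(build_options("sippr.thinkrite.com", "lb1.thinkrite.com"))
```

Sending this over a TCP connection to each WS Proxy and timing the response gives a rough view of what the load balancer's health check experiences.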
32. WS Proxy IP Forwarding Load Balancer and other Custom Properties
contactRegistryEnabled false for faster shutdown
disableAllHostNameLookups should be set to true for
performance, this does not affect the use of hostnames
in the below IPSprayer settings…
tcp/tls/udp.IPSprayer.host is the hostname of the virtual
IP of the load balancer – ie, for the SIPPR it is the
hostname of the address to which clients expect to
connect
ipForwardingLBEnabled true – replaces the host and
port from LB with the IPSprayer.host/port details
isSipComplianceEnabled false to avoid logging
interoperability events for TCP keep-alives, etc.
enableMultiClusterRouting true to allow (eg, keep-alive)
packets with apparently invalid routing info to SIPPR
http://www-01.ibm.com/support/docview.wss?uid=swg21666746
33. WS Proxy Custom Property for Older Clients
Older clients (Including ST 8.5.2 embedded in
Notes 9 – especially common on Linux where full
AV/SUT is not yet available in ST 9) need special
handling:
– Import
WebSphereSIPProxy/ConnectionReuseFilter.jar
from disk 1 of Media Manager as an Asset on the
WebSphere Proxy Server
– Configure a Business Level Application (BLA)
and BLA CU (Composition Unit) using this
artefact
– Set forceRport=true custom property
http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_clus_av_sippr_wasproxy_filter.dita
34. How One becomes Many – SIPPR/CM Redundancy
[Diagram: a Standalone Media Manager (CM and PR, could also include SSC and DB2) becomes a Clustered Media Manager with a redundant pair of Load Balancers (LB1, LB2) holding the VIPs in front of two CM WS Proxy Servers (CM PS1, CM PS2) and two SIPPR WS Proxy Servers (PR PS1, PR PS2), which front the CM and PR WAS Clusters (CM1, CM2, PR1, PR2); SSC is shown alongside a DB2 pair using HADR; SIP port labels :5060, :5080, :506x, :506y, :508x, :508y]
SSC includes deployment manager for all CM, SIPPR, PS, etc.
35. Redundancy for VMGR and VMCU Servers
VMCUs run on Linux only and are not
WebSphere/Java-based, they can be
configured in resource pools for redundancy
VMGRs while running with SIP in WebSphere
(on Linux only) do not use the WebSphere SIP
Container or WebSphere Proxy – they include
their own load balancers which are aware of
where requests were previously sent and are
being handled
For a single IP address to reach the VMGRs
use an IP Sprayer which is SIP (5080/5081)
and HTTP/HTTPS (7443) compliant (the same
as for other Sametime servers is fine)
[Diagram: IP Sprayers (IS1, IS2, holding a Virtual IP Address (VIP) that can fail over between them) spray IP/details from a single address to the VMGR Farm (VMGR1 to VMGR3 as Master, Hot Standby and Replica, each with its own load balancer distributing load and maintaining sessions), in front of the VMCU Farm (VMCU1 to VMCU3) split into VMCU pool 1 and VMCU pool 2; port labels :5060/:8443, :5060/:8080 and :5080/:7443]
37. How Highly Available is a Clustered Sametime AV Environment?
Failover of a MAC-Forwarding Load Balancer should not affect calls
– Load Balancer is not involved in the actual connection, only new incoming connections
– Connection information can also be replicated from one Load Balancer to its partner(s)
Loss of a WebSphere Application Server should not affect calls – shared environment
– However some SIP being processed by that Application Server could be lost, disrupting call set-
up, tear-down or continuation of a very small number of calls
Loss of a WS Proxy will result in calls being lost
– Unless you use UDP (which cannot normally cope with the size of packets which include all the
Sametime Codecs) the TCP/TLS connection from the client was established to a specific WS
Proxy so if that goes down its connections are dropped
– Each connection is a client’s ability to make/receive/continue calls so any calls are lost and the
clients will have to re-REGISTER when they detect the failure (within 1 minute, configurable)
– WS Proxies can be clustered but this does not provide High Availability / Connection information
being shared or any method to maintain TCP/TLS connection
38. High Availability Comparison – Sametime Unified Telephony
[Diagram: Client and CS reach the SIPPR Cluster with WS Proxys (LB3, LB4 with VIP; WPS3, WPS4; PR1, PR2), the Telephony Control Server cluster (SIPSM1, SIPSM2, CSTASM1, CSTASM2, UCE1, UCE2, two Solid DBs), the Telephony Application Server cluster (FW1, MS1, WAS1 and FW2, MS2, WAS2 on SAN, with a spare FW/MS/WAS node), VIPs and the IP PBXes]
Active/Active Telephony Control Server (TCS) Cluster 99.999% available
Hot/Hot Solid DB replication
Hot/Hot Universal Call Engine (UCE) with shared call context memory
SIP Service Manager (SIPSM) and Computer Supported Telecoms Apps Service Manager (CSTASM) can failover
Telephony Application Server Cluster with Hot Standby: Framework (FW) and Media Server (MS) on one SAN partition and WebSphere Application Server (WAS) on another
System Automation for MultiPlatforms (SAMP) and Reliable Scalable Cluster Technology (RSCT) manages failover to spare node
Softphone calls still go through SIPPR Cluster
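To put the 99.999% figure quoted for the Active/Active TCS cluster in perspective, the availability percentage can be converted into permitted downtime per year. This is plain arithmetic assuming a 365-day year; the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime per year."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        print(f"{pct}% -> {downtime_minutes_per_year(pct):.2f} minutes/year")
```

Five nines works out to a little over five minutes of downtime a year, which is the bar a softphone service has to meet if it is to replace PBX handsets.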
39. Comparing Other types of High Availability and Scalability
High Availability Disaster Recovery (HADR) replication for DB2 server pair with SAMP/RSCT
handling failover – no Virtual IP Addresses/Aliases – DB2 clients aware of both servers
VMware High Availability – much like SUT TAS but fails over the entire virtual machine
VMware Fault Tolerance – much like SUT TCS, second virtual machine in vLockStep
becomes active upon failure of first - but can only use one vCPU until SMP-FT in ESXi 6.0
40. Scalability and Redundancy for Other Sametime servers
SIP EDGE Servers scale up in the same way as SIPPR and CM using WS Proxy and LBs
Sametime Meeting Servers scale up in the same way using WS Proxy for HTTP and LBs
Bandwidth Manager can scale up in the same way but only with two nodes
– uses WAS7 so needs its own Deployment Manager to configure the cluster
Sametime Proxy Servers do not need WS Proxy Servers (just Load Balancers)
TURN Servers can be fronted by IP or MAC Forwarding Load Balancers
– http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/install/inst_config_turn_properties.dita
– Remember that no NAT can be configured and firewalls must be in transparent/bridging
mode, Clients must (appear to) be able to connect to TURN servers in DMZ directly
42. “When it’s Secure, Hack It!”
HTTPS / TLS / SRTP should be configured
– Force web traffic to SSL/TLS using boundary devices/firewalls
– Test media with TCP/RTP first and then switch to TLS/SRTP and re-test
– 3rd party devices may need certificates exchanged for TLS to work
– If need be can have some (eg, intranet to VCS) connections using TCP and others using TLS
Certificates from official Certificate Authority should be used on internet side
Discover what it takes to decode TLS using Wireshark
Discover what it could take to commit fraud or a DoS attack
Appreciate why you need to keep certificates and their (non-default!) passwords safe
Tighten security as a result of any findings and re-test to check nothing is broken
openssl pkcs12 -in key.p12 -nocerts -nodes -out decryptkey.pem
43. SSO and Securing anonymous access
Edit stavconfig.xml changing SIPAuthenticationType to LTPA if you have configured SSO
Enable anonymous access by token authentication on CS to avoid DoS attacks
http://www-01.ibm.com/support/knowledgecenter/SSKTXQ_9.0.0/admin/config/st_adm_security_allow_token_auth_enable.dita
Ensure there is an anonymous user in LDAP
Put the shared key txt files in a directory which can be found – with appropriate
permissions - on both SIPPR and CM (not in regular WAS profile directories which are
unique per system) and set shared secret key paths in WAS Trust Association
Interceptors, restart SIPPR and CM and check stavconfig.xml has these paths
Set TURNTokenAuthEnabled=true if clients are all ST 9.0 (TURN authentication not
supported by previous clients)
For TURN server put file from SecretKeyPathForTurnAuthToken and key txt files in root
directory / and put filenames in TurnServer.properties
45. Heads Up on Common Issues – IP Telephony
Restrictions on packet size (eg, UDP / SIP-aware firewalls) cause issues with the long list
of codecs, ICE/STUN/TURN candidates and encryption options in SIP from Sametime
Clients and VMCU – IP telephony may not have hit this issue in the same environment
G.729 is not currently supported except with SUT, iLBC is not yet supported in ST9, calls
over the WAN may be prevented from using G.711 – discuss your needs and options with IBM
Lossy codecs especially in combination or used twice (eg, on an external conference
bridge) may produce poor voice quality – ensure such use cases are evaluated
SIP session timers may provoke issues – set these low for testing and high for production
Test on/off hold, transfers, any conferencing and other special features like bridging, TLS...
46. Heads Up on Common Issues – WiFi and Firewalls
Corporate WiFi is a completely different environment to Mobile Data – test both!
Corporate Guest WiFi is another different environment - ensure the expectation and/or
testing receives focus early on, as changes in this environment are a sensitive area
WiFi in other environments (some airports, hotels, etc.) may also be too restrictive
– Using Mobile Data instead by switching off WiFi on phone would be expected to work
Move non-standard ports to 80 and 443 where possible to overcome firewall issues, or
specifically ask for firewalls to be opened for SIP (5060) and TURN (3478) and RTP
47. Lessons Learned in Hosting
VMCU really needs dedicated hardware meeting minimum spec (4 core, 8GB RAM) which
is best placed on-premises in customer data center to keep latency to a minimum
VMCU requires eth0 to be used for its connection to VMGR - this is not generally possible
to create without access to a Bare Metal Server (BMS)
Once you have one BMS get a second for redundancy and/or high speed (consistent
performance guaranteed iops) shared storage between them for clones
Reserving CPU, memory and bandwidth as documented are all important in high-
performance enterprise environments (much less so in small evaluations but reserve now or suffer later)
BMS reboots can cause datastore corruption – resist the urge to exploit simplistic
automated monitoring which can result in this!
48. Why I UNIX/Linux/AIX/…
Pick an OS which gives you fast, secure access to the command line and the ability to
troubleshoot the entire foundation of the system from that command line including the boot
process and background processes
Standardize on one OS … logically RHEL or SLES by virtue of VMGR / VMCU
Make exceptions where necessary (eg, Document Conversion, ability to restart services
without restarting entire Community Server)
49. OS Tips
Use bonding to both protect against physical adapter failure and simplify virtual machine
cloning
Reduce TCP keepalive time to prevent backed-up queues –
net.ipv4.tcp_keepalive_time=60
Reduce TCP FIN timeout to allow connections to end faster – net.ipv4.tcp_fin_timeout=30
Check and increase default system limits – ulimit / limits.conf / sysctl.conf
Use LVM with ext3/ext4 and leave space for snapshots
Install wireshark before you need it
50. Draw a diagram
Draw (or purloin) some deployment diagrams to share with IBM support – they will ask you
for them
– at a minimum include all Sametime components, proxies, load balancers
– if possible include additional detail on firewalls, VPNs, etc.
52. Phones Outside the Box
The call flows we showed included only Clients – but the scenarios can also involve
Phones
Sametime Meetings can also call out to Phones with simple SIPPR rules
– Condition: Method=INVITE RequestURI=sip:[0-9]{6}@.*
– Destination: Request-URI pattern=sip:(.+)@.* Output pattern=sip:$1@ippbx.x.y.com:5060;transport=tcp
– Also set TelephoneConferenceEnabled=true in ConferenceManager.properties in
/opt/IBM/WebSphere/profiles/*/installedApps/*/ConferenceFocus.ear/ConferenceFocus.war
Phones can also call into conference calls set up by Sametime Meetings
– Condition: Method=INVITE RequestURI=sip:[0-9]{4}@.* Source Address=ippbx.x.y.com
– Destination: sip:stvmgr.x.y.com:5060;transport=tcp
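The call-out rule above (six-digit numbers sent to the IP PBX) can be tried out offline before it is entered in SIPPR. The sketch below uses Python's re module as a stand-in; SIPPR's own regular expression engine and matching semantics may differ (whole-string matching is assumed here), the slide's $1 becomes \1 in Python, and the function name and the host st.x.y.com are hypothetical.

```python
import re

# Condition and patterns copied from the call-out rule above.
CONDITION = re.compile(r"sip:[0-9]{6}@.*")
REQUEST_URI_PATTERN = re.compile(r"sip:(.+)@.*")
OUTPUT_PATTERN = r"sip:\1@ippbx.x.y.com:5060;transport=tcp"  # $1 on the slide


def route_invite(method: str, request_uri: str):
    """Return the rewritten Request-URI, or None if the rule does not apply."""
    if method != "INVITE" or not CONDITION.fullmatch(request_uri):
        return None
    return REQUEST_URI_PATTERN.sub(OUTPUT_PATTERN, request_uri, count=1)


if __name__ == "__main__":
    print(route_invite("INVITE", "sip:123456@st.x.y.com"))
    print(route_invite("INVITE", "sip:1234@st.x.y.com"))
```

Running a list of real dialled numbers through a harness like this before changing SIPPR rules helps catch patterns that are too broad, such as a six-digit rule accidentally swallowing four-digit conference IDs.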
53. ST telephone numbers
ST will normally REGISTER using what is in the telephoneNumber field from LDAP
In fact ST really uses whatever is in Person document cache – which is taken from the
Business Card
Business Card can be changed in SSC or by editing XML but the Telephone Number field should
normally show the PSTN number
If there is no (valid) telephoneNumber then some outbound calls may work using the e-mail
address registration for P2P calls – but a valid telephoneNumber is required for reliable ST
telephone number and/or sip dialling
Obviously the numbers in LDAP for ST must be unique!
54. How can you call ST from a phone?
Assuming a user has a real phone and its number is in telephoneNumber then calls to
telephoneNumber would go to the real phone
An internal dialling code could be used to reach the softphone instead if IP PBXes can transform
the dialled number (SIPPR configuration cannot transform a received number to another number – the
“To:” header cannot be manipulated – what is received in INVITE must match what is
REGISTERed)
An external dialling convention is not possible but a call-forward on the real phone could reach
the softphone
SUT has superior support both for allowing a user to select their preferred device to receive a call
on and integration without call-forwarding, called and calling number translation, etc.
55. ST Plugin allows a field other than telephoneNumber for softphone
Custom Plugin allows use of other Business Card fields or other LDAP fields
Often users are allowed to edit their Telephone Number
– by using another field issues with user-edited numbers can be eliminated
56. Different capabilities are available with different vendors – external TCSPI integration
allows CS / CM to start conferences and provide moderator controls
Some integration can be achieved through sip addressing (ST )
– (Outbound) Condition: Method=INVITE Request URI=.*@x.y.com.*
– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:$1@dma.x.y.com:5060;transport=tcp
– (Inbound) Condition: Method=INVITE Source Address=dma.x.y.com
– Destination: sip:stcm.x.y.com:5060;transport=tcp (Push Route)
It is also possible to integrate with 3rd party video clients through such a bridge
– (Outbound) Condition: Method=INVITE Request URI=.*.3pvc.*
– Destination: Request URI pattern=sip:(.+)@.* Output pattern=sip:$1@3pbrid.x.y.com:5060;transport=tcp
3rd Party Video Conferencing
57. Monitoring Inside and Outside the Box
ThinkRite managed services hinge on pro-active monitoring scripts and a server dashboard
(not a product) which run 24/7 and notify staff of potential issues – many scripts run
checks on the servers, but dedicated SIP and VP (watchit) bots run on the intranet and internet
and can send independent alerts, as can the dashboard itself if updates dry up
There are many interfaces which may be useful for monitoring; identify which ones you can use
– the Bandwidth Manager ISC and STDBBWM database are particularly useful for monitoring calls
db2 -x "select cast(fromuserid as varchar(50)),cast(touserid as varchar(50)),endtime,endreason from bwm_media_sessions where starttime > (current_timestamp - 1 day)"
– also logs of the Conference Manager in …WebSphere/AppServer/profiles/*/logs/STMediaServer
callsummary.log.0 conference.log.0
– (REST) APIs on Conference Manager, Video Manager, … (see Links)
The greatest assurance, of course, remains Connect Client tests, for which watchit is invaluable
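One way a monitoring script might use the Bandwidth Manager query above is to count the last day's calls by end reason and alert on unusual values. The sketch below is illustrative only: it assumes each db2 -x row ends with the endtime and endreason columns as single whitespace-free tokens, and the function name and sample rows are hypothetical, not real Bandwidth Manager output.

```python
from collections import Counter


def summarize_end_reasons(db2_output):
    """Count calls per endreason from the text printed by the db2 -x query."""
    counts = Counter()
    for line in db2_output.splitlines():
        # Split from the right so user IDs containing spaces stay intact.
        parts = line.rsplit(None, 2)
        if len(parts) < 3:
            continue  # blank or malformed row
        _users, _end_time, end_reason = parts
        counts[end_reason] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical rows: fromuserid, touserid, endtime, endreason.
    sample = (
        "cn=alice smith,o=xy  cn=bob,o=xy  2015-01-26-10.15.00.000000 0\n"
        "cn=carol,o=xy  cn=dave,o=xy  2015-01-26-11.00.00.000000 3\n"
        "cn=bob,o=xy  cn=carol,o=xy  2015-01-26-11.30.00.000000 0\n"
    )
    for reason, total in summarize_end_reasons(sample).most_common():
        print(reason, total)
```

The text to summarise could come from running the db2 command in a scheduled script and capturing its standard output; which end reasons count as failures depends on the deployment and is not defined here.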
58. Useful Links
http://www.slideshare.net/a8us/utf-8enibm-sametime-9-voice-and-video-deployment
http://www-01.ibm.com/support/docview.wss?uid=swg27040186&aid=1
http://www-10.lotus.com/ldd/stwiki.nsf/xpViewCategories.xsp?lookupName=Voice%20and%20Video
– BWM deployment best practices, CM and VMGR REST APIs, new tricks for ST AV…
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/WebSphere+SIP+and+CEA/page/Configuring+and+Deploying+WebSphere+SIP+Environments
http://www-01.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tsip_tunelinux.html
59. Links not relevant to Sametime AV
https://www.ibm.com/developerworks/community/wikis/home?lang=en#/wiki/WebSphere%20SIP%20and%20CEA/page/Achieving%20High%20Availability%20with%20WebSphere%20Application%20Server%20SIP%20Container%20and%20F5%20BIG-IP%20Local%20Traffic%20Manager (does not apply to Sametime!)
http://www.f5.com/pdf/deployment-guides/ibm-sametime-dg.pdf (does not include SIP!)
60. Related Sessions
Mon 1:00pm Mockingbird 1 & 2 MAS204 IBM Sametime Deployment Do’s and Don’ts:
Tips, Tricks, Perils and Pitfalls
Tues 1:00pm Swan SW 1-2 BP103 Solving the Weird, Obscure and The Mind-Bending
Tues 3:45pm Dolphin S Hem 1 ID102 IBM Sametime: Design and Implementation of a full
HADR Deployment
Weds 10:30am Mockingbird 1&2 ID109 Digital Nightmares – The Biggest Performance
Killers in Your Environment
Weds 11:45am Swan SW 7-10 ID112 Connect the Dots: IBM Sametime Audio/Video
Planning, Deployment, Troubleshooting and Beyond
Weds 1:30pm Dolphin S Hem 1 ID108 Mobile Security Roundup
61. Who Was That Man?
Jeremy Sanders, MSc (Proj Mgmt) is the Chief Technical Officer of ThinkRite Ltd
(UK/EMEA) and continues to work with the ThinkRite team to integrate and develop
enhancements for IBM SUT, Sametime Voice/Softphone (“SUT-Lite”) and IBM Unified
Messaging. He’s been involved with IBM in development, integration, support and
administration of what we now call Unified Communications for over 20 years.
For further details see the first few slides of last year’s presentation…
http://www.slideshare.net/kbmsg/jmp206
62. ThinkRite Ltd is the European division of ThinkRite Inc/ThinkRite Pty
ThinkRite provides Sametime/SUT installation services, managed services, hosting services,
development services and innovative products including ThinkRite Assistant – Single Click to
connect to all voice and web meetings using Sametime softphone and Mobile clients
http://www.thinkrite.com/brochures/ThinkRite%20Assistant%20Brochure.pdf
Think What?
– One unique system for internal and external
– Secured VPN to connect to Directory and PBX if needed
– Available anywhere and on mobile devices without VPN access
Cloud 9.0
63. Engage Online
SocialBiz User Group socialbizug.org
– Join the epicenter of Notes and Collaboration user groups
Social Business Insights blog ibm.com/blogs/socialbusiness
– Read and engage with our bloggers
Follow us on Twitter
– @IBMConnect and @IBMSocialBiz
LinkedIn http://bit.ly/SBComm
– Participate in the IBM Social Business group on LinkedIn
Facebook https://www.facebook.com/IBMConnected
– Like IBM Social Business on Facebook