If you’ve ever needed to investigate a site-to-site tunnel that is down at 05:00, this might just help you get to the bottom of it a bit faster.
Steps to take
Access your firewall web interface and go to VPNs/Monitor Status
If the SA status shows inactive, then phase 1 is not even completing. Also access the firewall via SSH and take a look at the events for the tunnel in question:
get event include <peer ip address>
If you see the following message:
Phase 1 SA (my cookie:0b4c1390) was removed due to a simultaneous rekey.
You can adjust the firewall to use a different IPsec soft lifetime buffer (the default is 10 seconds):
get ike soft-lifetime-buffer
IPsec Soft Lifetime Buffer is 10 seconds
set ike soft-lifetime-buffer 60
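If you save the SSH session output to a file, a small script can spot the rekey message for you. This is only a rough sketch: it assumes the message text and the buffer line appear exactly as shown above (real event lines will usually carry timestamps and other fields around them), and the function names are made up for illustration.

```python
import re

# Patterns copied from the messages quoted above; the surrounding log
# format is an assumption, so we search rather than match whole lines.
REKEY_PATTERN = re.compile(
    r"Phase 1 SA \(my cookie:(?P<cookie>[0-9a-fA-F]+)\) "
    r"was removed due to a simultaneous rekey"
)
BUFFER_PATTERN = re.compile(
    r"IPsec Soft Lifetime Buffer is (?P<seconds>\d+) seconds"
)


def find_simultaneous_rekeys(event_text):
    """Return the cookie of every Phase 1 SA removed by a simultaneous rekey."""
    return [m.group("cookie") for m in REKEY_PATTERN.finditer(event_text)]


def parse_soft_lifetime_buffer(output):
    """Return the buffer in seconds from 'get ike soft-lifetime-buffer' output, or None."""
    match = BUFFER_PATTERN.search(output)
    return int(match.group("seconds")) if match else None
```

An empty list from find_simultaneous_rekeys means the message never showed up in the saved events, so the soft lifetime buffer is probably not your problem.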
You can use the Juniper docs to troubleshoot further. However, if access to the entire site is lost and phase 1 is not completing, that suggests an ISP problem.
Do check that the issue is not with the ISP itself before spending too much time on the VPN:
- Can you reach the public interface of the firewall?
- Can you reach the public interface of the router?
- Does a traceroute fail at a particular device?
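For the traceroute check, the useful detail is the last hop that answered. Here is a minimal sketch that pulls it out of saved traceroute output. It assumes Linux-style output where each hop line starts with the hop number and unanswered probes are printed as `*`; the sample uses documentation addresses, not real hosts, and the function name is hypothetical.

```python
import re

# A hop line starts with the hop number, e.g. " 3  * * *".
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.+)$")

SAMPLE = """traceroute to 203.0.113.10 (203.0.113.10), 30 hops max, 60 byte packets
 1  192.0.2.1 (192.0.2.1)  1.123 ms  1.045 ms  0.998 ms
 2  198.51.100.1 (198.51.100.1)  9.871 ms  9.902 ms  9.850 ms
 3  * * *
 4  * * *
"""


def last_responding_hop(traceroute_output):
    """Return (hop_number, host) for the last hop that answered, or None."""
    last = None
    for line in traceroute_output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header or blank line
        hop = int(match.group(1))
        # Anything that is not "*" means at least one probe got a reply.
        answered = [token for token in match.group(2).split() if token != "*"]
        if answered:
            last = (hop, answered[0])
    return last


print(last_responding_hop(SAMPLE))  # prints (2, '198.51.100.1')
```

Bear in mind that some devices simply do not reply to traceroute probes, so a silent hop is a hint about where to look, not proof of a fault.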
If you start to see issues with routing, then the problem is with an ISP. Check whether you can reach the router from multiple ISPs. If you can’t reach it from any of them, the issue is likely with the ISP hosting the site. If you can reach it from some places but not others, the issue is likely a failing ISP route.
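The reasoning above is simple enough to write down as a tiny decision helper. This is just an illustration: the function name, the vantage point names, and the result labels are invented, and the "reachable from everywhere" case (go back to the VPN itself) is my reading of the logic rather than something spelled out above.

```python
def classify_outage(reachable_from):
    """Classify an outage from reachability results.

    reachable_from maps a vantage point (e.g. an ISP you tested from)
    to True if the site's router answered from there, else False.
    """
    if not reachable_from:
        return "no_data"
    results = list(reachable_from.values())
    if all(results):
        # Assumption: if every vantage point can reach the router,
        # the ISP is probably fine and the VPN deserves your time.
        return "not_isp"
    if not any(results):
        return "hosting_isp"  # unreachable from everywhere
    return "isp_route"  # reachable from some places but not others
```

For example, classify_outage({"home_dsl": True, "office_fibre": False}) returns "isp_route", which points at a failing route rather than the ISP hosting the site.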
You will definitely need to raise the issue with the ISP (or multiple ISPs) to get it resolved as quickly as possible, as it is out of your control to fix!