Thursday, June 20, 2013

Thoughts on a three-hour server hosting outage

Last week the hosting company I use had a firewall problem that left my clients unable to use their systems for 3 hours. Here is my view.


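At around 15:30 that day my phone rang: the first client told me he could not connect. I tried at once and indeed could not connect either. I immediately called the hosting company and was put through to voicemail. Even the hosting company's own website was unreachable, and my email got no reply (I believe their mail server had died as well).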

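Since starting the server rental service half a year ago, I have persuaded 6 clients to rent the MemDB system, one of them with ten users. During the outage my phone rang non-stop. I knew it was my responsibility to handle it, so I answered every call and said sorry. The hosting company, however, ignored me completely. All I wanted to know was the cause and when the server would be back. Clients asked me this many times, but I had no answer and could only say that it was being repaired urgently...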

Below is the report the hosting company gave me afterwards:


[Picture: the hosting company's incident report]

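During those three hours my clients all switched to writing orders by hand and afterwards keyed them back into the system. You can imagine how much that cost them, and I am very sorry about it.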
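
To put those three hours in perspective, here is a rough sketch of how an outage translates into an availability figure. The 30-day month and the 182-day half-year are assumptions chosen purely for illustration, and `availability_percent` is a hypothetical helper, not something taken from the hosting company's report.

```python
def availability_percent(downtime_hours: float, period_hours: float) -> float:
    """Return the share of the period the service was reachable, in percent."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime_hours must lie between 0 and period_hours")
    return 100.0 * (1.0 - downtime_hours / period_hours)


OUTAGE_HOURS = 3.0                 # length of the outage described in this post
HOURS_PER_MONTH = 30 * 24          # assumption: a 30-day month
HOURS_PER_HALF_YEAR = 182 * 24     # assumption: roughly six months of service

monthly = availability_percent(OUTAGE_HOURS, HOURS_PER_MONTH)
half_year = availability_percent(OUTAGE_HOURS, HOURS_PER_HALF_YEAR)

print(f"Availability over a 30-day month: {monthly:.2f}%")
print(f"Availability over about half a year: {half_year:.2f}%")
```

A single three-hour outage looks small as a percentage, which is why a figure like this says little on its own about what the downtime cost the people who had to work around it.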

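Problems like this rarely happen, but when one does, could it not be handled better? First, shouldn't the hosting company answer my calls and tell me what happened? I work in this industry too, and I know that many technical problems cannot be avoided, but customers should still be kept informed.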

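Also, the report above shows that the hosting company called in a vendor to deal with the problem, and the vendor needed 3 hours to reach the data centre before it could be fixed (because they could not get remote access to the machine either).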

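In this Internet era, service like this is doomed to fail. Fortunately for them, I have deliberately not named the hosting company; otherwise their loss would be heavy, because word of mouth is very powerful online.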

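This week I have found another hosting company, and also a networking specialist to work with. I feel that anyone who really sets out to run a hosting company well has many areas to improve and many ways to raise profit. I hope to develop this field together with this partner, and I will share more with everyone later.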

7 comments:

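  1. I think this cost is relative. If you need near-uninterrupted service, you have to pay a higher rent, and in turn your clients have to bear more cost as well, so they may then reconsider whether to rent such a service at all.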

  2. Strange that a data centre these days can get away without hiring a single operator!? Something that could be done in 17 minutes took 3 hours.....

  3. I don't think the responsibility lies only with the hosting company. There are some things to consider, or perhaps to improve, as below:
    -Have an SLA agreed with the client
    This sets the client's expectations of the application service. It should cover not only utility (function) but also warranty (non-functional: capacity, availability, etc.)
    -Have an OLA / UC agreed with the hosting company through a vendor management process
    Are they providing HA facilities such as hardware with no single point of failure, local / cross-site resilience, etc.?
    -Any option for the customer to choose HA / non-HA
    This is a way of setting the client's expectations, so they realize they get what they pay for.
    -Is an SCP in place to be triggered
    What is the continuity plan in terms of the application service?
    -Is a BCP in place to be triggered
    What is the continuity plan in terms of the business process?
    Also, I would highly recommend that you treat this event as an opportunity and offer a consulting service to your customers to review their business operation continuity plans. I am sure you will find more business opportunities by doing so.

  4. You should take responsibility for some areas yourself, such as providing disaster recovery. That means you need two different data centre providers, so that if one of them goes down you can get your clients to connect to the other server. Surely you have already done your program for the database sync, so it is not hard for you; it just depends on cost.
    Peter Yim

  5. Peter :
    You should take responsibility for some areas yourself, such as providing disaster recovery. That means you need two different data centre providers, so that if one of them goes down you can get your clients to connect to the other server...

    Yes. After this incident... I have already made many improvements and put contingency measures in place...

  6. Which company is it? Let me blacklist them.

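  7. I suppose the hosting company's phones were also ringing non-stop at the time, so they basically had no way to respond.
    Also, system stability really is very important for some trades. That explains why ordinary family doctors' clinics have generally not gone fully computerised: you can't tell patients who have already turned up to come back another day because the system is down~~ :p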
