UrlShortenerBundle
![Software License][ico-license]
Getting started
1. Install
$ composer require leopardd/url-shortener-bundle
2. Register bundle
<?php
// app/AppKernel.php
public function registerBundles()
{
    $bundles = [
        // ...
        new Leopardd\Bundle\UrlShortenerBundle\LeoparddUrlShortenerBundle(),
    ];
    // ...
}
3. (Optional) Set up parameters
# config.yml
leopardd_url_shortener:
    hashids:
        salt: new salt
        min_length: 10
        alphabet: abcdefghijklmnopqrstuvwxyz1234567890
4. Update database schema
$ php app/console doctrine:schema:update --force
Folder structure
Controller
DependencyInjection
Entity..................represents the table structure
Event
Exception
Factory.................creates Entity instances
Repository..............handles all interaction with the database
Resources
Service.................contains the business logic
Feature & Update & Note
- [x] Example project: leopardd/bitly
- [x] Support PHP 5.6, 7.0
- [ ] Message queue
- [ ] Caching (e.g. Redis)
- Scaling
  - [ ] Separate host/db/table/id
  - [ ] Docker swarm setup
  - [ ] What happens when the ID reaches the maximum value in MySQL
  - [ ] Many databases
- [ ] Random bulk generate URL for test
- [x] Case-sensitive
- [x] Remove trailing slash before processing
- [ ] Change naming: EncodeService to ProcessService
- [ ] Test EventDispatcher
- [ ] Using Symfony\Component\Validator as a service + JMS to validate data
- [ ] Implement FriendsOfSymfony/FOSRestBundle
- Properties
  - [ ] Custom code
  - [ ] Hits
  - [ ] Expiry date
- Validate url
  - [x] Is url
  - [x] Remove trailing slash
  - [x] No more than 255 characters
- Research
- Test
  - [x] phpspec
  - [x] PHP CodeSniffer
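The three checks listed under "Validate url" can be sketched in a few lines of plain PHP. This is only an illustration: the function name `normalizeLongUrl` is hypothetical and is not part of the bundle, and the bundle's own validation may be stricter or ordered differently.

```php
<?php

/**
 * Hypothetical helper (not the bundle's API): normalizes and validates a long URL.
 *
 * @param string $url
 * @return string the URL with surrounding whitespace and trailing slashes removed
 * @throws InvalidArgumentException when the URL is malformed or too long
 */
function normalizeLongUrl($url)
{
    // Remove trailing slashes before any other check. Case is left untouched.
    $url = rtrim(trim($url), '/');

    // Reject anything that is not a syntactically valid URL.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        throw new InvalidArgumentException('Not a valid URL');
    }

    // Enforce the 255-character limit (URLs accepted above are ASCII, so bytes equal characters).
    if (strlen($url) > 255) {
        throw new InvalidArgumentException('URL is longer than 255 characters');
    }

    return $url;
}
```

Stripping the slash first means `http://example.com/path/` and `http://example.com/path` are treated as the same long URL.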
Algorithm
Get "short-url" process
- Insert the "long-url" into the database, then return its row-id
- Encode the row-id, then save the result as the code
Redirect "short-url" process
- Decode the incoming "short-url" to get the row-id
- Return the item stored at that row-id
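The two processes above can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: the class `InMemoryShortener` and its methods are hypothetical, an array replaces the database table, and plain base-62 encoding replaces Hashids so the example runs without dependencies.

```php
<?php

/**
 * Illustrative only: an array stands in for the table, base 62 stands in for Hashids.
 */
class InMemoryShortener
{
    const ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

    /** @var array row-id => array('url' => ..., 'code' => ...) */
    private $rows = array();

    /** @var int next auto-increment id */
    private $nextId = 1;

    /** Get "short-url": insert the long URL, take the row-id, encode it, save the code. */
    public function shorten($longUrl)
    {
        $id = $this->nextId++;
        $this->rows[$id] = array('url' => $longUrl, 'code' => null);

        $code = $this->encode($id);
        $this->rows[$id]['code'] = $code;

        return $code;
    }

    /** Redirect "short-url": decode the code to a row-id, then return that row's URL. */
    public function resolve($code)
    {
        $id = $this->decode($code);
        if ($id === null || !isset($this->rows[$id])) {
            return null;
        }

        return $this->rows[$id]['url'];
    }

    /** Writes a non-negative integer in base 62. */
    private function encode($id)
    {
        $alphabet = self::ALPHABET;
        $code = '';
        do {
            $remainder = $id % 62;
            $code = $alphabet[$remainder] . $code;
            $id = (int) (($id - $remainder) / 62);
        } while ($id > 0);

        return $code;
    }

    /** Reads a base-62 string back into an integer; null if a character is unknown. */
    private function decode($code)
    {
        $id = 0;
        $length = strlen($code);
        for ($i = 0; $i < $length; $i++) {
            $position = strpos(self::ALPHABET, $code[$i]);
            if ($position === false) {
                return null;
            }
            $id = $id * 62 + $position;
        }

        return $id;
    }
}
```

Note that `resolve()` never searches by code: the code itself carries the row-id, so the read is a direct key access.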
Reason behind "row-id" approach
A URL shortener produces a shortened version of a URL. It has two operations:
1. Generate: produce a shortened version of the submitted URL
2. Lookup: when the shortened version is requested, look up its reference in the database and return the original URL
The challenges are:
1. Lookup time
2. Allow a very large number of unique IDs
3. At the same time, keep the ID length as small as possible
4. IDs should be reasonably user-friendly and, if possible, memorable
5. Scale across multiple instances (sharding)
6. What happens when the ID reaches its maximum value (e.g. if the length is 7 and the alphabet is [A-Z, a-z, 0-9], we can serve 62^7, about 3.5 trillion, URLs)
7. Replication: a database can crash for many reasons, so how do we replicate instances, recover quickly, and keep reads and writes consistent?
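The capacity figure in point 6 is simple arithmetic and can be checked directly. The sketch below assumes plain positional encoding over a 62-symbol alphabet. Hashids codes may come out longer for the same id, so read the lengths as a lower bound. Both function names are hypothetical.

```php
<?php

/** Number of distinct codes of exactly $length symbols over the given alphabet size. */
function codeCapacity($length, $alphabetSize = 62)
{
    return pow($alphabetSize, $length);
}

/** Smallest code length whose capacity exceeds $maxId (plain positional encoding). */
function minCodeLength($maxId, $alphabetSize = 62)
{
    $length = 1;
    $capacity = $alphabetSize;
    while ($capacity <= $maxId) {
        $capacity *= $alphabetSize;
        $length++;
    }

    return $length;
}

// On 64-bit PHP: 62^7 = 3521614606208, roughly 3.5 trillion.
echo codeCapacity(7), "\n";

// MySQL's unsigned INT tops out at 4294967295, which fits in 6 base-62 symbols.
echo minCodeLength(4294967295), "\n";
```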
In this bundle, we focus on point 1: how to reduce lookup time.
Table structure
id: number
code: shortened version of long url
url: long url
Attempt 1
- Generate: random id
- Lookup: simple lookup (O(n))
Attempt 2
- Generate: hash function from long url
- Lookup: simple lookup (O(n))
Attempt 3
- Generate: hash function from long url
- Lookup: bloom filters
Attempt 4
- Generate: hash function from record id
- Lookup: decoding (O(1))
So this project uses Attempt 4, with Hashids as the hash function (Hashids generates a unique code for each distinct id).
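To make the difference between the lookups concrete, here is a minimal sketch that contrasts a scan over the code column with decoding the code back to the id. It rests on assumptions: the rows are made-up sample data, the function names are hypothetical, and PHP's built-in `base_convert()` to base 36 stands in for Hashids.

```php
<?php

/** Attempt 1 and 2 style lookup: compare every row's code until one matches, O(n). */
function lookupByScan(array $rows, $code)
{
    foreach ($rows as $row) {
        if ($row['code'] === $code) {
            return $row['url'];
        }
    }

    return null;
}

/** Attempt 4 style lookup: decode the code to the id, then read that row directly. */
function lookupByDecode(array $rows, $code)
{
    // Stand-in decoder: base 36 back to a decimal id.
    $id = (int) base_convert($code, 36, 10);

    return isset($rows[$id]) ? $rows[$id]['url'] : null;
}

// Sample rows keyed by id, mirroring the id / code / url table.
$rows = array();
$urls = array('http://example.com/a', 'http://example.com/b', 'http://example.com/c');
foreach ($urls as $index => $url) {
    $id = $index + 1000;
    $rows[$id] = array('code' => base_convert((string) $id, 10, 36), 'url' => $url);
}
```

Both functions return the same URL for a known code. The second one does it without touching any other row, which in a database corresponds to a primary-key read instead of a search on the code column.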